Beta Support for XDNA2 Platform by Victor-Jung · Pull Request #179 · pulp-platform/Deeploy

Victor-Jung · 2026-03-24T16:12:19Z

This PR adds beta support for the XDNA2 platform. This is also the first attempt at adding a new MLIR backend for Deeploy.

Currently, it passes a single Add node ONNX graph; on this platform, there is no such "untiled" execution possible, as the AIE cores can't read outside their local L1. Hence, the XDNA platform is inherently tiled.

Currently, the CI uses self-hosted runners to run directly on the XDNA2 NPU from a Minisforum X1 Pro.

Added

A Dockerfile specifically for the XDNA platform. It only requires the mlir-aie and llvm-aie packages installable with pip. You have to run this container with specific arguments to forward the NPU device and driver libraries; the procedure is described in README_XDNA.md.
A new target library for XDNA2. It uses the LLVM compiler for AIE (named Peano) to compile the Add kernel.
MLIRNodeTemplate and MLIRCodeTransformation base classes in MLIRDataTypes.py — dialect-agnostic abstractions for MLIR-emitting backends (two-phase: device passes + runtime-sequence passes).
XDNA2Deployer that constructs a single mlir.ir.Module using the aie.dialects Python API instead of rendering Mako C templates.
Code transformation passes: MLIRObjectFifoPass (ObjectFifo creation + kernel declaration), MLIRComputeCorePass (tiling loop + acquire/release), MLIRRuntimeSequencePass (shim DMA configuration).
Auto-tiling with L1 memory constraints via XDNA2AddTileConstraint.
XRT-based C++ testbench with BF16 ULP tolerance comparison.
GitHub Actions workflow to build and publish the XDNA container image.
CI workflow on self-hosted runner (_runner-xdna2.yml).
requirements-xdna.txt to decouple XDNA pip dependencies from base requirements.

Changed

aie.dialects API: link_with moved from aie_d.core() to aie_d.external_func() (mlir-aie v1.3.2)
aie import is optional, non-XDNA users don't need mlir-aie installed

PR Merge Checklist

The PR is rebased on the latest devel commit and pointing to devel.
Your PR reviewed and approved.
All checks are passing.
The CHANGELOG.md file has been updated.
If the docker was modified, change back its link after review.

Summary by CodeRabbit

Release Notes

New Features
- Launched XDNA2/AIE2p hardware deployment platform with integrated MLIR code generation backend
- Automatic memory-aware tiling with L1 memory-level constraints for optimized tensor operations
- XRT-based hardware execution framework with integrated validation against golden reference outputs
Documentation
- Added comprehensive XDNA2 setup guide, Docker containerization, and CI/CD infrastructure

coderabbitai · 2026-04-09T11:34:02Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2ebbf7f4-adb1-4423-8073-2f5d4ddb78cb

📥 Commits

Reviewing files that changed from the base of the PR and between cde0c5d and b8e9e34.

📒 Files selected for processing (38)

.github/workflows/_runner-xdna2.yml
.github/workflows/ci-platform-xdna2.yml
.github/workflows/docker-build-deeploy-xdna.yml
.gitignore
CHANGELOG.md
CMakeLists.txt
Container/Dockerfile.deeploy-xdna
Deeploy/MLIRDataTypes.py
Deeploy/Targets/XDNA2/Bindings.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRComputeCorePass.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRObjectFifoPass.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRRuntimeSequencePass.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/__init__.py
Deeploy/Targets/XDNA2/Deployer.py
Deeploy/Targets/XDNA2/Parsers.py
Deeploy/Targets/XDNA2/Platform.py
Deeploy/Targets/XDNA2/Templates/AddTemplate.py
Deeploy/Targets/XDNA2/Templates/__init__.py
Deeploy/Targets/XDNA2/Tiler.py
Deeploy/Targets/XDNA2/TypeCheckers.py
DeeployTest/Platforms/XDNA2/CMakeLists.txt
DeeployTest/Platforms/XDNA2/main.cpp
DeeployTest/Tests/Kernels/BF16/Add/Regular/inputs.npz
DeeployTest/Tests/Kernels/BF16/Add/Regular/network.onnx
DeeployTest/Tests/Kernels/BF16/Add/Regular/outputs.npz
DeeployTest/conftest.py
DeeployTest/deeployRunner_xdna2.py
DeeployTest/generateNetwork_xdna2.py
DeeployTest/testUtils/core/execution.py
DeeployTest/testUtils/deeployRunner.py
DeeployTest/testUtils/platformMapping.py
DeeployTest/testUtils/testRunner.py
DeeployTest/test_platforms.py
DeeployTest/test_xdna2_config.py
README_XDNA.md
TargetLibraries/XDNA2/CMakeLists.txt
TargetLibraries/XDNA2/kernels/add.cc
requirements-xdna.txt

✅ Files skipped from review due to trivial changes (4)

.gitignore
Deeploy/Targets/XDNA2/Templates/init.py
Deeploy/Targets/XDNA2/Parsers.py
Deeploy/Targets/XDNA2/Tiler.py

🚧 Files skipped from review as they are similar to previous changes (14)

CMakeLists.txt
.github/workflows/ci-platform-xdna2.yml
Deeploy/Targets/XDNA2/Templates/AddTemplate.py
DeeployTest/deeployRunner_xdna2.py
Deeploy/Targets/XDNA2/Bindings.py
DeeployTest/testUtils/deeployRunner.py
DeeployTest/test_platforms.py
Deeploy/Targets/XDNA2/Deployer.py
TargetLibraries/XDNA2/kernels/add.cc
DeeployTest/Platforms/XDNA2/CMakeLists.txt
TargetLibraries/XDNA2/CMakeLists.txt
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRRuntimeSequencePass.py
DeeployTest/conftest.py
DeeployTest/test_xdna2_config.py

📝 Walkthrough

Walkthrough

Adds full XDNA2 (AIE2p NPU) platform support to Deeploy. Introduces MLIR base types, three AIE code transformation passes (ObjectFifo, ComputeCore, RuntimeSequence), a vectorized BF16 Add AIE kernel, an XRT C++ testbench, MLIR-aware deployer, network generation scripts, test framework integration, Dockerfile, and GitHub Actions CI workflows.

Changes

XDNA2 Platform Support

Layer / File(s)	Summary
MLIR base types, type checker, and Add template `Deeploy/MLIRDataTypes.py`, `Deeploy/Targets/XDNA2/TypeCheckers.py`, `Deeploy/Targets/XDNA2/Templates/AddTemplate.py`, `Deeploy/Targets/XDNA2/Templates/__init__.py`	`MLIRExecutionBlock`, `MLIRCodeTransformationPass`, `MLIRCodeTransformation`, and `MLIRNodeTemplate` establish the shared base contracts. `XDNA2AddChecker` (BF16 `SignPropTypeChecker`) and `XDNA2AddTemplate` (emits a single `func_d.call` to `eltwise_add_bf16_vector`) implement the first operator.
XDNA2 AIE code transformation passes `Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRObjectFifoPass.py`, `...MLIRComputeCorePass.py`, `...MLIRRuntimeSequencePass.py`, `.../__init__.py`	`MLIRObjectFifoPass` derives tile size, creates BF16 object FIFOs, and declares the external kernel. `MLIRComputeCorePass` emits `aie_d.core` with nested `scf.for` tiling loops and FIFO acquire/release. `MLIRRuntimeSequencePass` injects DMA task configuration for each FIFO.
XDNA2 bindings, tiler, and platform `Deeploy/Targets/XDNA2/Bindings.py`, `Deeploy/Targets/XDNA2/Tiler.py`, `Deeploy/Targets/XDNA2/Platform.py`, `Deeploy/Targets/XDNA2/Parsers.py`	`XDNA2Transformer` chains the three passes into an `MLIRCodeTransformation`. `XDNA2AddBindings` and `XDNA2AddTilingReadyBindings` wire checker, template, and transformer. `XDNA2Engine`, `XDNA2AIECoreEngine`, `XDNA2Platform`, `MemoryXDNA2Platform`, and `MemoryXDNA2PlatformWrapper` complete the platform with buffer classes and memory-level routing.
XDNA2Deployer `Deeploy/Targets/XDNA2/Deployer.py`	`XDNA2Deployer.generateMLIR()` validates bound layers, creates `MLIRExecutionBlock`s with fixed tile coordinates, runs device passes inside `@aie_d.device(npu2)`, emits a `@aiex_d.runtime_sequence` with per-block DMA args, verifies the module, and returns MLIR text.
AIE BF16 Add kernel and CMake build system `TargetLibraries/XDNA2/kernels/add.cc`, `TargetLibraries/XDNA2/CMakeLists.txt`, `CMakeLists.txt`	Vectorized `eltwise_add_bf16_vector` AIE2p kernel with 16-element vector loop and scalar remainder. `TargetLibraries/XDNA2/CMakeLists.txt` resolves `LLVM-AIE`/`MLIR-AIE` paths via `aie.utils.config` and cross-compiles `.cc` files to AIE2p objects. Root `CMakeLists.txt` gains the XDNA2 platform branch.
C++ XRT testbench, testbench build, and network generation `DeeployTest/Platforms/XDNA2/main.cpp`, `DeeployTest/Platforms/XDNA2/CMakeLists.txt`, `DeeployTest/generateNetwork_xdna2.py`, `DeeployTest/deeployRunner_xdna2.py`	`main.cpp` loads xclbin/npu_insts, allocates BF16 XRT buffers, runs the `MLIR_AIE` kernel, and validates output with BF16 ULP tolerance. Its CMake file runs `aiecc.py` to produce xclbin and builds the XRT host binary. `generateNetwork_xdna2.py` generates `testinputs.h`/`testoutputs.h` and `network.mlir` via the XDNA2 deployer pipeline.
Test framework integration and CI `DeeployTest/testUtils/platformMapping.py`, `DeeployTest/testUtils/testRunner.py`, `DeeployTest/testUtils/deeployRunner.py`, `DeeployTest/testUtils/core/execution.py`, `DeeployTest/conftest.py`, `DeeployTest/test_platforms.py`, `DeeployTest/test_xdna2_config.py`, `.github/workflows/*`, `Container/Dockerfile.deeploy-xdna`, `requirements-xdna.txt`, `README_XDNA.md`, `.gitignore`, `CHANGELOG.md`	`platformMapping.py` registers XDNA2 as a non-signprop platform and wires `XDNA2Deployer`. `testRunner.py`/`deeployRunner.py` gain `--l1`/`--l3` tiling args and `xdna2` platform key. `execution.py` routes to `generateNetwork_xdna2.py` and propagates `-v`. `test_platforms.py` adds `test_xdna2_kernels`. CI adds reusable `_runner-xdna2.yml`, `ci-platform-xdna2.yml`, and Docker build workflows. Ubuntu 24.04 Dockerfile installs XRT and MLIR-AIE Python packages.

Sequence Diagram(s)

sequenceDiagram
  participant CI as GitHub Actions CI
  participant Docker as deeploy-xdna Container
  participant PyDeployer as XDNA2Deployer
  participant MLIRPasses as CodeTransformationPasses
  participant aiecc as aiecc.py
  participant XRT as XRT Runtime / NPU

  CI->>Docker: docker run --device /dev/accel/accel0
  Docker->>PyDeployer: generateNetworkXDNA2(args)
  PyDeployer->>MLIRPasses: applyDevicePasses (ObjectFifo → ComputeCore)
  MLIRPasses-->>PyDeployer: MLIRExecutionBlock with fifoMap, tileSize
  PyDeployer->>MLIRPasses: applyRuntimeSequencePasses (RuntimeSequence)
  MLIRPasses-->>PyDeployer: DMA task descriptors embedded in MLIR
  PyDeployer-->>Docker: network.mlir
  Docker->>aiecc: aiecc.py network.mlir → network.xclbin + npu_insts.bin
  aiecc-->>Docker: xclbin artifacts
  Docker->>XRT: xrt::device + load xclbin + run MLIR_AIE kernel
  XRT-->>Docker: output BF16 buffer
  Docker->>Docker: bf16_nearly_equal comparison vs golden
  Docker-->>CI: Errors: 0 out of N

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

pulp-platform/Deeploy#178: Modifies DeeployTest/testUtils/core/execution.py in the same generate_network/run_simulation functions that this PR also changes for XDNA2 platform routing and host verbosity.

Poem

🐰 A bunny hops onto the NPU stage,
BF16 additions now part of the page!
Object FIFOs flow, tile loops spin,
MLIR dialects dancing within.
aiecc.py compiles, XRT takes flight —
XDNA2 deploys with vectorized might! ✨

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

coderabbitai

Actionable comments posted: 7

🧹 Nitpick comments (16)

DeeployTest/testUtils/testRunner.py (1)

95-114: Validate tiling CLI inputs at parse-time.

--l1/--l3 currently accept non-positive values, and --defaultMemLevel accepts arbitrary strings. Adding argument-level validation avoids late failures in generation passes.

Suggested refactor

+def _positive_int(value: str) -> int:
+    ivalue = int(value)
+    if ivalue <= 0:
+        raise argparse.ArgumentTypeError("Value must be a positive integer")
+    return ivalue
+
 class TestGeneratorArgumentParser(argparse.ArgumentParser):
@@
             self.add_argument('--l1',
                               metavar = '<size>',
                               dest = 'l1',
-                              type = int,
+                              type = _positive_int,
                               default = None,
                               help = 'Set L1 memory size in bytes (enables tiling if specified).\n')
             self.add_argument('--l3',
                               metavar = '<size>',
                               dest = 'l3',
-                              type = int,
+                              type = _positive_int,
                               default = None,
                               help = 'Set L3 memory size in bytes.\n')
             self.add_argument('--defaultMemLevel',
                               metavar = '<level>',
                               dest = 'defaultMemLevel',
                               type = str,
+                              choices = ["L1", "L2", "L3"],
                               default = "L3",
                               help = 'Set default memory level (default: L3)\n')

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@DeeployTest/testUtils/testRunner.py` around lines 95 - 114, The CLI accepts
invalid tiling inputs; validate them at argument-parse time by replacing the raw
int/str types with validation helpers: for '--l1' and '--l3' use a
positive-integer validator (e.g., positive_int) that raises
argparse.ArgumentTypeError for <=0 values, and for '--defaultMemLevel' restrict
values via choices (e.g., ['L1','L3']) or a validator that normalizes and
rejects unknown values; update the self.add_argument calls for l1, l3, and
defaultMemLevel in testRunner.py accordingly so parsing fails fast with clear
messages.

DeeployTest/test_platforms.py (1)

128-150: Consider updating the markers summary to include xdna2.

The markers summary comment documents available platform markers but doesn't include the newly added xdna2 marker.

Suggested addition

 #   snitch: tests from the Snitch platform (untiled)
 #   snitch_tiled: tests from the Snitch platform (tiled)
 #   siracusa: tests from the Siracusa platform (untiled)
 #   siracusa_tiled: tests from the Siracusa platform (tiled)
 #   siracusa_neureka_tiled: tests from the Siracusa + Neureka platform (tiled)
 #   gap9: tests from the GAP9 platform (untiled)
 #   gap9_tiled: tests from the GAP9 platform (tiled)
+#   xdna2: tests from the XDNA2 platform

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@DeeployTest/test_platforms.py` around lines 128 - 150, Update the markers
summary comment in the Platform markers section of DeeployTest/test_platforms.py
to include the new platform marker `xdna2`; specifically add `xdna2: tests from
the XDNA2 platform` (or a similarly descriptive phrase) alongside the existing
platform marker lines so the comment accurately documents the new `xdna2`
marker.

CMakeLists.txt (1)

23-24: Add "XDNA2" to the platform property strings list.

The XDNA2 platform is handled in the build logic (lines 49-50 and 304-317), but it's not included in the set_property(CACHE platform PROPERTY STRINGS ...) list. This affects CMake GUI/ccmake users who rely on the dropdown for valid platform options.

Proposed fix

 set(platform MemPool CACHE STRING "Platform (MemPool, SoftHier, QEMU, Siracusa, Siracusa_w_neureka, PULP-Open, GAP9, Generic, Snitch)")
-set_property(CACHE platform PROPERTY STRINGS MemPool SoftHier QEMU Siracusa Siracusa_w_neureka PULP-Open GAP9 Generic Snitch)
+set_property(CACHE platform PROPERTY STRINGS MemPool SoftHier QEMU Siracusa Siracusa_w_neureka PULP-Open GAP9 Generic Snitch XDNA2)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@CMakeLists.txt` around lines 23 - 24, The CMake cache property list for the
platform option is missing "XDNA2", causing GUI/ccmake users to not see it as a
valid choice; update the set_property(CACHE platform PROPERTY STRINGS ...) call
to include "XDNA2" (and keep it consistent with the initial set(platform ...)
default if you want it selectable there too) so the platform dropdown contains
XDNA2 alongside MemPool, SoftHier, QEMU, Siracusa, Siracusa_w_neureka,
PULP-Open, GAP9, Generic, and Snitch.

Deeploy/Targets/XDNA2/CodeTransformationPasses/__init__.py (1)

5-7: Consider explicit imports for better IDE support and static analysis.

While wildcard imports in __init__.py are a common re-export pattern, explicit imports with __all__ provide better discoverability and make the public API explicit. This is a minor maintainability suggestion.

Optional: Explicit imports alternative

-from Deeploy.Targets.XDNA2.CodeTransformationPasses.MLIRComputeCorePass import *
-from Deeploy.Targets.XDNA2.CodeTransformationPasses.MLIRObjectFifoPass import *
-from Deeploy.Targets.XDNA2.CodeTransformationPasses.MLIRRuntimeSequencePass import *
+from Deeploy.Targets.XDNA2.CodeTransformationPasses.MLIRComputeCorePass import MLIRComputeCorePass
+from Deeploy.Targets.XDNA2.CodeTransformationPasses.MLIRObjectFifoPass import MLIRObjectFifoPass
+from Deeploy.Targets.XDNA2.CodeTransformationPasses.MLIRRuntimeSequencePass import MLIRRuntimeSequencePass
+
+__all__ = ['MLIRComputeCorePass', 'MLIRObjectFifoPass', 'MLIRRuntimeSequencePass']

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/__init__.py` around lines 5 -
7, Replace the three wildcard imports in __init__.py with explicit imports of
the public symbols exported by MLIRComputeCorePass, MLIRObjectFifoPass, and
MLIRRuntimeSequencePass (e.g., import the specific classes/functions you intend
to re-export) and then define an explicit __all__ list containing those symbol
names; update the import lines referencing
Deeploy.Targets.XDNA2.CodeTransformationPasses.MLIRComputeCorePass,
MLIRObjectFifoPass, and MLIRRuntimeSequencePass to import only the public
symbols and set __all__ = [...names...] so IDEs and static analyzers can
discover the package API.

DeeployTest/testUtils/platformMapping.py (1)

286-287: Chain the exception for better debugging.

When re-raising an exception, use from to preserve the original traceback. This helps with debugging when the import fails for unexpected reasons.

🔧 Proposed fix

-        except ImportError:
-            raise RuntimeError(f"Deployer for platform {platform} is not implemented")
+        except ImportError as err:
+            raise RuntimeError(f"Deployer for platform {platform} is not implemented") from err

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@DeeployTest/testUtils/platformMapping.py` around lines 286 - 287, The except
ImportError handler in the function that loads deployers currently raises
RuntimeError(f"Deployer for platform {platform} is not implemented") without
chaining the original ImportError; modify the except block to capture the
ImportError (e.g., except ImportError as e:) and re-raise the RuntimeError using
exception chaining (raise RuntimeError(f"Deployer for platform {platform} is not
implemented") from e) so the original traceback is preserved for debugging.

Container/Dockerfile.deeploy-xdna (2)

14-35: Consider adding --no-install-recommends to reduce image size.

Adding --no-install-recommends can significantly reduce the Docker image size by avoiding installation of suggested but non-essential packages.

📦 Proposed fix

 RUN apt-get update && apt-get install -y \
+    --no-install-recommends \
     software-properties-common \
     && add-apt-repository -y ppa:amd-team/xrt \
-    && apt-get update && apt-get install -y \
+    && apt-get update && apt-get install -y --no-install-recommends \
     cmake \

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@Container/Dockerfile.deeploy-xdna` around lines 14 - 35, Update the RUN layer
that invokes apt-get to add --no-install-recommends to both apt-get install -y
invocations in the multi-step RUN command (the line starting with "RUN apt-get
update && apt-get install -y \" and the subsequent "&& apt-get update && apt-get
install -y \" sequence) so that unnecessary recommended packages are not
installed and image size is reduced; keep the existing rm -rf
/var/lib/apt/lists/* cleanup as-is.

12-12: Clarify the purpose of LLVM_INSTALL_DIR="nope".

This environment variable with a placeholder value may cause confusion. Consider adding a comment explaining why this is set or whether it's a workaround for a specific issue.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@Container/Dockerfile.deeploy-xdna` at line 12, The ENV line setting
LLVM_INSTALL_DIR="nope" is ambiguous; update the Dockerfile to either set a
meaningful path or add an inline comment explaining why "nope" is used (e.g., as
a deliberate sentinel to disable LLVM lookup or to work around a specific build
issue). Locate the ENV LLVM_INSTALL_DIR declaration and either replace "nope"
with the correct install path or append a short comment describing the reason
and any related workarounds or downstream expectations so future maintainers
aren’t confused.

README_XDNA.md (2)

21-21: Consider clarifying the temporary IRON dependency.

This line references a specific developer's path (/scratch/jungvi/IRON) and an internal workaround. Consider either removing this line for public documentation or adding context about when this workaround is needed vs. the standard flow.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README_XDNA.md` at line 21, Update the README_XDNA.md sentence that embeds a
personal developer path by either removing the specific `/scratch/jungvi/IRON`
example or replacing it with a generic, documented workaround: describe that
mounting an IRON repo is a temporary step only required until Deeploy
midend/backend support MLIR, show a generic volume example like `-v
/path/to/IRON:/opt/IRON`, and explain the corresponding env var
`IRON_OPERATORS_DIR=/opt/IRON/iron/operators` and when users should use it
versus the standard flow; ensure the text clearly labels it as a temporary,
developer-only workaround.
6-8: Add language specifiers to fenced code blocks.

The code blocks are missing language specifiers, which affects syntax highlighting and accessibility. Add bash or shell to the command blocks.
📝 Proposed fix
-```
+```bash
 docker build -f Container/Dockerfile.deeploy-xdna -t deeploy-xdna:local .
 ```
Apply similar changes to other code blocks at lines 11, 24, 38, and 44.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README_XDNA.md` around lines 6 - 8, Add language specifiers (e.g., bash or
shell) to the fenced code blocks that contain shell commands so they get proper
syntax highlighting; for example, change the block around the docker build
command (`docker build -f Container/Dockerfile.deeploy-xdna -t
deeploy-xdna:local .`) to start with ```bash and similarly update the other
command/code blocks referenced (lines with commands at the other snippets) to
use ```bash or ```shell so all shell snippets in README_XDNA.md are consistently
labeled.

DeeployTest/generateNetwork_xdna2.py (2)

191-196: Hardcoded Add operation limits reusability.

The golden output computation hardcodes elementwise addition, which will break for other operations. The TODO comment acknowledges this should be moved to ONNX4Deeploy.

Consider either:

Running ONNX inference with BF16-quantized inputs to compute operation-agnostic golden outputs

Extracting the operation type from the graph and dispatching to the appropriate NumPy function

Would you like me to help design a more generic golden output computation approach?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@DeeployTest/generateNetwork_xdna2.py` around lines 191 - 196, The
golden-output computation currently hardcodes elementwise addition using
bf16_inputs -> bf16_input_f32 -> golden_f32 and will fail for non-add ops;
replace this with an operation-agnostic approach: either run the ONNX graph
(using the BF16-quantized inputs bf16_inputs) to produce test_outputs_bf16, or
inspect the model graph to extract the root operator(s) and dispatch to the
corresponding NumPy routine (e.g., use op_type from nodes and call the
appropriate handler) so that test_outputs_bf16 is produced generically instead
of summing elements.
129-137: Fix unused loop variable and add strict=True for safety.

The name variable is unused, and zip() without strict= can silently produce incorrect results if the iterables have different lengths.
🔧 Proposed fix
-    for index, (name, values) in enumerate(zip(inputs_npz.files, test_inputs_f32)):
+    for index, (_name, values) in enumerate(zip(inputs_npz.files, test_inputs_f32, strict=True)):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@DeeployTest/generateNetwork_xdna2.py` around lines 129 - 137, In the loop
that currently reads "for index, (name, values) in
enumerate(zip(inputs_npz.files, test_inputs_f32))", mark the unused filename
variable as _ and make zip strict to avoid length-mismatch bugs: change the
header to "for index, (_, values) in enumerate(zip(inputs_npz.files,
test_inputs_f32, strict=True))" so you still populate inputTypes/inputOffsets
with PointerClass(bfloat16_t) as before; no other logic changes needed.

Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRObjectFifoPass.py (1)

47-47: Use next(iter(...)) instead of list(...)[0].

Per static analysis suggestion, this is more efficient as it avoids creating an intermediate list.

♻️ Proposed fix

-                firstOutputName = list(outputConstraints.keys())[0]
+                firstOutputName = next(iter(outputConstraints.keys()))

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRObjectFifoPass.py` at line
47, Replace the inefficient list(...) indexing when obtaining the first key from
outputConstraints: locate the assignment to firstOutputName in
MLIRObjectFifoPass.py and use next(iter(outputConstraints)) (or
next(iter(outputConstraints.keys()))) instead of
list(outputConstraints.keys())[0] so you avoid creating an intermediate list;
update the code where firstOutputName is set accordingly.

Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRComputeCorePass.py (1)

68-88: Assumes uniform tile type for all input/output tensors.

Line 70 uses firstKey's type as tileTy, and line 74 creates a single subviewTy used for all FIFO acquisitions. This works for the current Add operation where all tensors have identical shapes and types, but will need generalization for operators with mixed tensor types.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRComputeCorePass.py` around
lines 68 - 88, The code assumes a single representative tile type (tileTy
derived from firstKey) and reuses one subviewTy for all FIFO acquisitions, which
breaks when tensors have different types; update the loop so you look up the
per-key memref type via mlirBlock.fifoTypes[key] and create a matching subview
type for each acquisition (i.e., compute subviewTyPerKey from
mlirBlock.fifoTypes[key] before calling
aie_d.objectfifo_acquire/objectfifo_subview_access), handling inputs
(self.inputTensorKeys) and outputs (self.outputTensorKeys) independently rather
than reusing tileTy/subviewTy for every key.

Deeploy/Targets/XDNA2/Platform.py (1)

126-138: Duplicated getTargetMemoryLevel logic.

Both MemoryXDNA2Platform and MemoryXDNA2PlatformWrapper implement identical logic for checking _engine_assignment. Consider extracting to a shared helper function to maintain consistency.

♻️ Proposed refactor

def _get_xdna2_target_memory_level(node: gs.Node, default_level_name: str) -> str:
    """Shared logic for XDNA2 memory level selection."""
    if hasattr(node, '_engine_assignment'):
        engine = node._engine_assignment
        if isinstance(engine, XDNA2AIECoreEngine) and hasattr(engine, 'preferredMemoryLevel'):
            return engine.preferredMemoryLevel
    return default_level_name

Then both classes can call this helper in their getTargetMemoryLevel methods.

Also applies to: 150-157

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@Deeploy/Targets/XDNA2/Platform.py` around lines 126 - 138, Extract the
duplicated engine-assignment logic from both
MemoryXDNA2Platform.getTargetMemoryLevel and
MemoryXDNA2PlatformWrapper.getTargetMemoryLevel into a shared helper (e.g.,
_get_xdna2_target_memory_level) that accepts the node and the default level
name; have each getTargetMemoryLevel call this helper instead of repeating the
hasattr/instanceof checks against XDNA2AIECoreEngine and preferredMemoryLevel,
returning engine.preferredMemoryLevel when present or
defaultTargetMemoryLevel.name otherwise.

Deeploy/Targets/XDNA2/Deployer.py (1)

159-171: Tensor type derived from first node only.

Line 163 derives tensorTy from firstEb.numElements and hardcodes BF16Type. This assumes all runtime sequence arguments have the same element count and type. For the current Add-only scope this is fine, but consider documenting this assumption or adding validation when extending to multi-node graphs with varying tensor sizes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Deeploy/Targets/XDNA2/Deployer.py` around lines 159 - 171, The code derives
tensorTy from the first block (mlirBlocks[0]) using firstEb.numElements and a
hardcoded ir.BF16Type, which assumes every runtime-sequence arg shares the same
element count and type; update the runtime-sequence setup in Deployer.py to
validate that all blocks in mlirBlocks have matching numElements (and, if
applicable, matching element types) before creating tensorTy (or document this
assumption clearly), and emit a clear error or fallback if a mismatch is
detected so aiex_d.runtime_sequence(_seq) is not created with inconsistent
tensor shapes.

DeeployTest/Platforms/XDNA2/CMakeLists.txt (1)

29-69: LGTM! Toolchain resolution is well-structured.

The fallback chain (cache → environment → Python query) for LLVM_AIE_INSTALL_DIR and MLIR_AIE_INSTALL_DIR provides good flexibility.

One minor inconsistency: XRT_INSTALL_DIR isn't declared as a cache variable like the others. Consider adding CACHE PATH for consistency:
-if(NOT XRT_INSTALL_DIR)
+set(XRT_INSTALL_DIR "" CACHE PATH "XRT install dir")
+if(NOT XRT_INSTALL_DIR)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@DeeployTest/Platforms/XDNA2/CMakeLists.txt` around lines 29 - 69, The
XRT_INSTALL_DIR variable should be declared as a cache variable like
LLVM_AIE_INSTALL_DIR and MLIR_AIE_INSTALL_DIR for consistent configuration and
overrides; change the declaration so XRT_INSTALL_DIR is initialized with
set(XRT_INSTALL_DIR "$ENV{XRT_INSTALL_DIR}" CACHE PATH "XRT install dir") (or
similar) and preserve the existing fallback logic in the if(NOT XRT_INSTALL_DIR)
block that checks ENV{XILINX_XRT} and defaults to "/opt/xilinx/xrt", ensuring
the cache variable is used/checked instead of an undeclared plain variable.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Deeploy/MLIRDataTypes.py`:
- Around line 187-201: The generate method uses a mutable default
operatorRepresentation = {} which can lead to shared-state bugs; change the
signature of generate (the method named generate in MLIRDataTypes.py) to accept
operatorRepresentation = None (or Optional[Dict]) and inside the method do a
guarded assignment like "if operatorRepresentation is None:
operatorRepresentation = {}" before using it, ensuring you preserve the existing
behavior and return value while avoiding the mutable default argument.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRRuntimeSequencePass.py`:
- Around line 85-89: The current sequence awaits outputTasks but never frees
them and frees inputTasks before ensuring completion; change the order so both
inputTasks and outputTasks are awaited before calling aiex_d.dma_free_task on
them. Specifically, use aiex_d.dma_await_task(...) for inputTasks as well (or
otherwise ensure their completion), then call aiex_d.dma_free_task(...) on each
input task, and after awaiting each output task call aiex_d.dma_free_task(...)
on each output task; update the block handling outputTasks/inputTasks (and
consider the origin of issue_token) so freeing only happens after dma_await_task
confirms completion.

In `@DeeployTest/Platforms/XDNA2/main.cpp`:
- Around line 79-112: Wrap the main logic of main (the code that calls
read_instr_binary(), xrt::device, xrt::xclbin, device.register_xclbin,
xrt::hw_context and xrt::kernel) in a try-catch block that catches
std::exception (and optionally a catch-all) and prints a clear error message
including ex.what() to stderr (or via your logging mechanism) and returns a
non-zero exit code; ensure any thrown exceptions from read_instr_binary() or XRT
constructors are handled so the program exits gracefully with an informative
message.
- Around line 65-77: In read_instr_binary, validate the file size and read
result: use std::streampos/istream::pos_type for tellg, check for error (e.g.,
tellg()==-1) and throw if so, verify byte_size is a multiple of sizeof(uint32_t)
and throw a descriptive std::runtime_error if not aligned, and after file.read
confirm gcount/streams state to ensure the full byte_size was read and throw on
short reads or IO errors; update uses of byte_size to a safely cast size_t only
after validation.

In `@DeeployTest/testUtils/core/execution.py`:
- Around line 171-173: The current code appends "-v" whenever config.verbose >=
1 which breaks the Generic/host binary that doesn't accept flags; change the
condition to only add "-v" when the run is for the XDNA2 binary (e.g. check the
simulator/platform identifier used in this module — e.g. simulator == "XDNA2" or
platform == "XDNA2") so replace the existing guard with something like: only
append "-v" when config.verbose >= 1 AND the simulator/platform equals the XDNA2
identifier (leave the cmd and config.verbose symbols intact and reference
XDNA2/main.cpp as the binary that accepts -v).

In `@requirements-xdna.txt`:
- Around line 6-10: requirements-xdna.txt currently lists llvm-aie without a
version, causing nondeterministic installs from the nightly index; update the
dependency by replacing the unpinned "llvm-aie" entry with a specific pinned
version compatible with mlir_aie (for example
"llvm-aie==18.0.0.2024050100+59d494d3" or a released tag you verified), so the
file contains the exact version string for llvm-aie alongside mlir_aie==v1.2.1
to ensure reproducible, compatible builds.

In `@TargetLibraries/XDNA2/kernels/add.cc`:
- Around line 20-40: The vector loop in eltwise_vadd uses F = size / vec_factor
and drops trailing elements when size isn't a multiple of vec_factor; add a
scalar tail loop after the existing vector loop that processes indices from F *
vec_factor up to size, performing element-wise addition using the same types
(T_in/T_out) and either the pointer cursors (pA1/pB1/pC1) or direct indexing
(a[i], b[i], c[i]) so all remaining elements are handled.

---

Nitpick comments:
In `@CMakeLists.txt`:
- Around line 23-24: The CMake cache property list for the platform option is
missing "XDNA2", causing GUI/ccmake users to not see it as a valid choice;
update the set_property(CACHE platform PROPERTY STRINGS ...) call to include
"XDNA2" (and keep it consistent with the initial set(platform ...) default if
you want it selectable there too) so the platform dropdown contains XDNA2
alongside MemPool, SoftHier, QEMU, Siracusa, Siracusa_w_neureka, PULP-Open,
GAP9, Generic, and Snitch.

In `@Container/Dockerfile.deeploy-xdna`:
- Around line 14-35: Update the RUN layer that invokes apt-get to add
--no-install-recommends to both apt-get install -y invocations in the multi-step
RUN command (the line starting with "RUN apt-get update && apt-get install -y \"
and the subsequent "&& apt-get update && apt-get install -y \" sequence) so that
unnecessary recommended packages are not installed and image size is reduced;
keep the existing rm -rf /var/lib/apt/lists/* cleanup as-is.
- Line 12: The ENV line setting LLVM_INSTALL_DIR="nope" is ambiguous; update the
Dockerfile to either set a meaningful path or add an inline comment explaining
why "nope" is used (e.g., as a deliberate sentinel to disable LLVM lookup or to
work around a specific build issue). Locate the ENV LLVM_INSTALL_DIR declaration
and either replace "nope" with the correct install path or append a short
comment describing the reason and any related workarounds or downstream
expectations so future maintainers aren’t confused.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/__init__.py`:
- Around line 5-7: Replace the three wildcard imports in __init__.py with
explicit imports of the public symbols exported by MLIRComputeCorePass,
MLIRObjectFifoPass, and MLIRRuntimeSequencePass (e.g., import the specific
classes/functions you intend to re-export) and then define an explicit __all__
list containing those symbol names; update the import lines referencing
Deeploy.Targets.XDNA2.CodeTransformationPasses.MLIRComputeCorePass,
MLIRObjectFifoPass, and MLIRRuntimeSequencePass to import only the public
symbols and set __all__ = [...names...] so IDEs and static analyzers can
discover the package API.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRComputeCorePass.py`:
- Around line 68-88: The code assumes a single representative tile type (tileTy
derived from firstKey) and reuses one subviewTy for all FIFO acquisitions, which
breaks when tensors have different types; update the loop so you look up the
per-key memref type via mlirBlock.fifoTypes[key] and create a matching subview
type for each acquisition (i.e., compute subviewTyPerKey from
mlirBlock.fifoTypes[key] before calling
aie_d.objectfifo_acquire/objectfifo_subview_access), handling inputs
(self.inputTensorKeys) and outputs (self.outputTensorKeys) independently rather
than reusing tileTy/subviewTy for every key.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRObjectFifoPass.py`:
- Line 47: Replace the inefficient list(...) indexing when obtaining the first
key from outputConstraints: locate the assignment to firstOutputName in
MLIRObjectFifoPass.py and use next(iter(outputConstraints)) (or
next(iter(outputConstraints.keys()))) instead of
list(outputConstraints.keys())[0] so you avoid creating an intermediate list;
update the code where firstOutputName is set accordingly.

In `@Deeploy/Targets/XDNA2/Deployer.py`:
- Around line 159-171: The code derives tensorTy from the first block
(mlirBlocks[0]) using firstEb.numElements and a hardcoded ir.BF16Type, which
assumes every runtime-sequence arg shares the same element count and type;
update the runtime-sequence setup in Deployer.py to validate that all blocks in
mlirBlocks have matching numElements (and, if applicable, matching element
types) before creating tensorTy (or document this assumption clearly), and emit
a clear error or fallback if a mismatch is detected so
aiex_d.runtime_sequence(_seq) is not created with inconsistent tensor shapes.

In `@Deeploy/Targets/XDNA2/Platform.py`:
- Around line 126-138: Extract the duplicated engine-assignment logic from both
MemoryXDNA2Platform.getTargetMemoryLevel and
MemoryXDNA2PlatformWrapper.getTargetMemoryLevel into a shared helper (e.g.,
_get_xdna2_target_memory_level) that accepts the node and the default level
name; have each getTargetMemoryLevel call this helper instead of repeating the
hasattr/instanceof checks against XDNA2AIECoreEngine and preferredMemoryLevel,
returning engine.preferredMemoryLevel when present or
defaultTargetMemoryLevel.name otherwise.

In `@DeeployTest/generateNetwork_xdna2.py`:
- Around line 191-196: The golden-output computation currently hardcodes
elementwise addition using bf16_inputs -> bf16_input_f32 -> golden_f32 and will
fail for non-add ops; replace this with an operation-agnostic approach: either
run the ONNX graph (using the BF16-quantized inputs bf16_inputs) to produce
test_outputs_bf16, or inspect the model graph to extract the root operator(s)
and dispatch to the corresponding NumPy routine (e.g., use op_type from nodes
and call the appropriate handler) so that test_outputs_bf16 is produced
generically instead of summing elements.
- Around line 129-137: In the loop that currently reads "for index, (name,
values) in enumerate(zip(inputs_npz.files, test_inputs_f32))", mark the unused
filename variable as _ and make zip strict to avoid length-mismatch bugs: change
the header to "for index, (_, values) in enumerate(zip(inputs_npz.files,
test_inputs_f32, strict=True))" so you still populate inputTypes/inputOffsets
with PointerClass(bfloat16_t) as before; no other logic changes needed.

In `@DeeployTest/Platforms/XDNA2/CMakeLists.txt`:
- Around line 29-69: The XRT_INSTALL_DIR variable should be declared as a cache
variable like LLVM_AIE_INSTALL_DIR and MLIR_AIE_INSTALL_DIR for consistent
configuration and overrides; change the declaration so XRT_INSTALL_DIR is
initialized with set(XRT_INSTALL_DIR "$ENV{XRT_INSTALL_DIR}" CACHE PATH "XRT
install dir") (or similar) and preserve the existing fallback logic in the
if(NOT XRT_INSTALL_DIR) block that checks ENV{XILINX_XRT} and defaults to
"/opt/xilinx/xrt", ensuring the cache variable is used/checked instead of an
undeclared plain variable.

In `@DeeployTest/test_platforms.py`:
- Around line 128-150: Update the markers summary comment in the Platform
markers section of DeeployTest/test_platforms.py to include the new platform
marker `xdna2`; specifically add `xdna2: tests from the XDNA2 platform` (or a
similarly descriptive phrase) alongside the existing platform marker lines so
the comment accurately documents the new `xdna2` marker.

In `@DeeployTest/testUtils/platformMapping.py`:
- Around line 286-287: The except ImportError handler in the function that loads
deployers currently raises RuntimeError(f"Deployer for platform {platform} is
not implemented") without chaining the original ImportError; modify the except
block to capture the ImportError (e.g., except ImportError as e:) and re-raise
the RuntimeError using exception chaining (raise RuntimeError(f"Deployer for
platform {platform} is not implemented") from e) so the original traceback is
preserved for debugging.

In `@DeeployTest/testUtils/testRunner.py`:
- Around line 95-114: The CLI accepts invalid tiling inputs; validate them at
argument-parse time by replacing the raw int/str types with validation helpers:
for '--l1' and '--l3' use a positive-integer validator (e.g., positive_int) that
raises argparse.ArgumentTypeError for <=0 values, and for '--defaultMemLevel'
restrict values via choices (e.g., ['L1','L3']) or a validator that normalizes
and rejects unknown values; update the self.add_argument calls for l1, l3, and
defaultMemLevel in testRunner.py accordingly so parsing fails fast with clear
messages.

In `@README_XDNA.md`:
- Line 21: Update the README_XDNA.md sentence that embeds a personal developer
path by either removing the specific `/scratch/jungvi/IRON` example or replacing
it with a generic, documented workaround: describe that mounting an IRON repo is
a temporary step only required until Deeploy midend/backend support MLIR, show a
generic volume example like `-v /path/to/IRON:/opt/IRON`, and explain the
corresponding env var `IRON_OPERATORS_DIR=/opt/IRON/iron/operators` and when
users should use it versus the standard flow; ensure the text clearly labels it
as a temporary, developer-only workaround.
- Around line 6-8: Add language specifiers (e.g., bash or shell) to the fenced
code blocks that contain shell commands so they get proper syntax highlighting;
for example, change the block around the docker build command (`docker build -f
Container/Dockerfile.deeploy-xdna -t deeploy-xdna:local .`) to start with
```bash and similarly update the other command/code blocks referenced (lines
with commands at the other snippets) to use ```bash or ```shell so all shell
snippets in README_XDNA.md are consistently labeled.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e910d040-c1e1-4193-9705-285039a5ece9

📥 Commits

Reviewing files that changed from the base of the PR and between 09993fb and a82fd52.

📒 Files selected for processing (36)

.github/workflows/_runner-xdna2.yml
.github/workflows/ci-platform-xdna2.yml
.gitignore
CMakeLists.txt
Container/Dockerfile.deeploy-xdna
Deeploy/MLIRDataTypes.py
Deeploy/Targets/XDNA2/Bindings.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRComputeCorePass.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRObjectFifoPass.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRRuntimeSequencePass.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/__init__.py
Deeploy/Targets/XDNA2/Deployer.py
Deeploy/Targets/XDNA2/Parsers.py
Deeploy/Targets/XDNA2/Platform.py
Deeploy/Targets/XDNA2/Templates/AddTemplate.py
Deeploy/Targets/XDNA2/Templates/__init__.py
Deeploy/Targets/XDNA2/Tiler.py
Deeploy/Targets/XDNA2/TypeCheckers.py
DeeployTest/Platforms/XDNA2/CMakeLists.txt
DeeployTest/Platforms/XDNA2/main.cpp
DeeployTest/Tests/Kernels/BF16/Add/Regular/inputs.npz
DeeployTest/Tests/Kernels/BF16/Add/Regular/network.onnx
DeeployTest/Tests/Kernels/BF16/Add/Regular/outputs.npz
DeeployTest/conftest.py
DeeployTest/deeployRunner_xdna2.py
DeeployTest/generateNetwork_xdna2.py
DeeployTest/testUtils/core/execution.py
DeeployTest/testUtils/deeployRunner.py
DeeployTest/testUtils/platformMapping.py
DeeployTest/testUtils/testRunner.py
DeeployTest/test_platforms.py
DeeployTest/test_xdna2_config.py
README_XDNA.md
TargetLibraries/XDNA2/CMakeLists.txt
TargetLibraries/XDNA2/kernels/add.cc
requirements-xdna.txt

Xeratec

Hi Victor, thanks for this valuable contribution. I have some comments but overall the PR is already in a very good shape.

Xeratec · 2026-05-06T09:37:26Z

+      docker_image:
+        description: "XDNA2 Docker image (must be pre-built on the runner)"
+        required: false
+        default: "deeploy-xdna:local"


Why use the local file and not ghcr.io/pulp-platform/deeploy-xdnba:devel"?

Because right now the CI runs locally on one of my machine and I don't have collaborators. Once this platform becomes more mainstream I think it will be a good idea to publish a docker but until then it's overkill IMO. Especially considering that the XDNA platform has little dependencies.

Hmm, okay, I do not fully agree,' as building the container and making it available to others is very easy. But if you insist, I wont prevent the merge because of this.

Done in 131cc2e

Xeratec · 2026-05-06T09:50:18Z

+
+    def __init__(self,
+                 name: str = "XDNA2_AIE_Core",
+                 Mapping = XDNA2Mapping,


Why does this use XDNA2Mapping and not XDNA2TilingMapping?

This was redundent and is now removed, see 98faf37

coderabbitai

🧹 Nitpick comments (6)

Container/Dockerfile.deeploy-xdna (6)
43-45: ⚡ Quick win

Remove unnecessary /app/build directory creation and cleanup.

Lines 14–15 create /app/build as a WORKDIR, but no build operations actually occur there before it's removed at line 45. This cleanup is pointless and confusing. Either use /app/build for actual build operations (as suggested in past reviews), or skip creating it entirely.
♻️ Proposed fix to remove the unnecessary steps
-WORKDIR /app/build
-
 RUN apt-get update && apt-get install -y \
     software-properties-common \
     && add-apt-repository -y ppa:amd-team/xrt \
-# Remove unused files and clean up to reduce image size
-WORKDIR /app
-RUN rm -rf /app/build
-
 COPY pyproject.toml requirements-xdna.txt ./
And add WORKDIR /app before the COPY command:
+WORKDIR /app
+
 COPY pyproject.toml requirements-xdna.txt ./
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Container/Dockerfile.deeploy-xdna` around lines 43 - 45, Remove the pointless
creation and cleanup of /app/build: delete the WORKDIR /app/build and the RUN rm
-rf /app/build lines (they are the WORKDIR /app/build and RUN rm -rf /app/build
entries) and ensure a single WORKDIR /app is set before the COPY commands so
files are copied into /app directly; alternatively, if you intend to perform
builds in a separate context, convert the created WORKDIR /app/build into the
actual build step instead of removing it.
41-41: 💤 Low value

Append to LD_LIBRARY_PATH instead of replacing it.

The current assignment replaces any existing LD_LIBRARY_PATH value. For better composability with base images or multi-stage builds, append the XRT library path instead.
🔗 Proposed fix to append the path
-ENV LD_LIBRARY_PATH=${XILINX_XRT}/lib
+ENV LD_LIBRARY_PATH=${XILINX_XRT}/lib:${LD_LIBRARY_PATH}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Container/Dockerfile.deeploy-xdna` at line 41, The Dockerfile currently sets
LD_LIBRARY_PATH with ENV LD_LIBRARY_PATH=${XILINX_XRT}/lib which overwrites any
existing LD_LIBRARY_PATH; change it to append the XRT lib path instead so
existing entries are preserved. Update the ENV instruction that references
LD_LIBRARY_PATH and XILINX_XRT to export
LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}${XILINX_XRT}/lib (or
equivalent shell-safe append), ensuring it handles an empty original value
correctly and uses the existing LD_LIBRARY_PATH if present.
56-56: ⚖️ Poor tradeoff

Consider adding a non-root USER for security.

The container runs as root by default, which is flagged by static analysis (DS-0002) as a security concern. While root access may be necessary for NPU device access (/dev/accel/accel0) in the CI environment, consider whether a non-root user with appropriate device permissions would work.

If root is required, document why in a comment. Otherwise, add a USER directive:
🔒 Proposed fix to add non-root user
 ENV MLIR_AIE_PYTHON=/usr/bin/python3
 
+# Create non-root user for security
+RUN useradd -m -s /bin/bash deeploy && \
+    chown -R deeploy:deeploy /app
+USER deeploy
+
 WORKDIR /app/Deeploy
Note: If NPU device access requires root, this may not be feasible without additional device permission configuration.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Container/Dockerfile.deeploy-xdna` at line 56, The Dockerfile currently sets
WORKDIR /app/Deeploy and leaves the container running as root; add a non-root
user (e.g., create and chown to a user) and add a USER directive after WORKDIR
so the app runs without root, or if NPU device access truly requires root, add
an explicit comment explaining why root is required and list the device
(/dev/accel/accel0) and CI constraints; update ownership of /app/Deeploy to that
user and ensure any startup scripts or entries in the Dockerfile (where WORKDIR
is used) run under that user.
48-52: ⚡ Quick win

Remove toml-to-requirements after use to reduce image size.

The toml-to-requirements package is installed to generate requirements.txt but is left in the image afterward. Since it's only needed at build time, uninstall it to reduce the final image size.
🗑️ Proposed fix to uninstall build-time dependency
 COPY pyproject.toml requirements-xdna.txt ./
 RUN pip install toml-to-requirements && \
     toml-to-req --toml-file pyproject.toml && \
     pip install -r requirements.txt && \
     pip install -r requirements-xdna.txt && \
+    pip uninstall -y toml-to-requirements && \
     rm -f requirements.txt pyproject.toml requirements-xdna.txt
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Container/Dockerfile.deeploy-xdna` around lines 48 - 52, The Dockerfile
currently installs the build-only package toml-to-requirements and leaves it in
the image; update the RUN sequence that calls pip install toml-to-requirements
and toml-to-req so that after generating requirements.txt and installing the
requirements you uninstall the build tool (e.g., add a pip uninstall -y
toml-to-requirements step after pip install -r requirements.txt && pip install
-r requirements-xdna.txt) and then remove the temporary files (requirements.txt,
pyproject.toml, requirements-xdna.txt) so the build-time dependency is not
retained in the final image.
16-37: ⚡ Quick win

Add --no-install-recommends to reduce image size.

The apt-get install commands on lines 19 and 19 (second invocation) are missing the --no-install-recommends flag, which causes APT to install recommended packages and increases the image size unnecessarily.
📦 Proposed fix to add the flag
 RUN apt-get update && apt-get install -y \
+    --no-install-recommends \
     software-properties-common \
     && add-apt-repository -y ppa:amd-team/xrt \
     && apt-get update && apt-get install -y \
+    --no-install-recommends \
     cmake \
     ninja-build \
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Container/Dockerfile.deeploy-xdna` around lines 16 - 37, Update the two
apt-get install invocations in the RUN instruction so they include the
--no-install-recommends flag to avoid pulling recommended packages and reduce
image size; modify the RUN block that calls "apt-get install -y
software-properties-common" and the subsequent "apt-get install -y cmake
ninja-build g++ ..." to add --no-install-recommends to each apt-get install
command and keep the existing && rm -rf /var/lib/apt/lists/* cleanup unchanged.
12-12: 💤 Low value

Replace LLVM_INSTALL_DIR="nope" with a valid value or remove it.

The environment variable LLVM_INSTALL_DIR is used by the test suite to locate the toolchain installation directory. Setting it to "nope" is not a valid path and will cause test failures when the code attempts to access files in this directory. Either leave it unset (allowing tests to skip gracefully) or set it to the actual path where the LLVM toolchain is installed.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Container/Dockerfile.deeploy-xdna` at line 12, The Dockerfile sets ENV
LLVM_INSTALL_DIR="nope", which is an invalid path and breaks tests; update the
ENV LLVM_INSTALL_DIR entry to either remove it entirely or set it to a valid
LLVM toolchain install path (replace the value for LLVM_INSTALL_DIR with the
actual installation directory used in your environment or omit the ENV line so
tests can skip), ensuring any references to LLVM_INSTALL_DIR in the build/test
scripts will resolve correctly.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@Container/Dockerfile.deeploy-xdna`:
- Around line 43-45: Remove the pointless creation and cleanup of /app/build:
delete the WORKDIR /app/build and the RUN rm -rf /app/build lines (they are the
WORKDIR /app/build and RUN rm -rf /app/build entries) and ensure a single
WORKDIR /app is set before the COPY commands so files are copied into /app
directly; alternatively, if you intend to perform builds in a separate context,
convert the created WORKDIR /app/build into the actual build step instead of
removing it.
- Line 41: The Dockerfile currently sets LD_LIBRARY_PATH with ENV
LD_LIBRARY_PATH=${XILINX_XRT}/lib which overwrites any existing LD_LIBRARY_PATH;
change it to append the XRT lib path instead so existing entries are preserved.
Update the ENV instruction that references LD_LIBRARY_PATH and XILINX_XRT to
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}${XILINX_XRT}/lib
(or equivalent shell-safe append), ensuring it handles an empty original value
correctly and uses the existing LD_LIBRARY_PATH if present.
- Line 56: The Dockerfile currently sets WORKDIR /app/Deeploy and leaves the
container running as root; add a non-root user (e.g., create and chown to a
user) and add a USER directive after WORKDIR so the app runs without root, or if
NPU device access truly requires root, add an explicit comment explaining why
root is required and list the device (/dev/accel/accel0) and CI constraints;
update ownership of /app/Deeploy to that user and ensure any startup scripts or
entries in the Dockerfile (where WORKDIR is used) run under that user.
- Around line 48-52: The Dockerfile currently installs the build-only package
toml-to-requirements and leaves it in the image; update the RUN sequence that
calls pip install toml-to-requirements and toml-to-req so that after generating
requirements.txt and installing the requirements you uninstall the build tool
(e.g., add a pip uninstall -y toml-to-requirements step after pip install -r
requirements.txt && pip install -r requirements-xdna.txt) and then remove the
temporary files (requirements.txt, pyproject.toml, requirements-xdna.txt) so the
build-time dependency is not retained in the final image.
- Around line 16-37: Update the two apt-get install invocations in the RUN
instruction so they include the --no-install-recommends flag to avoid pulling
recommended packages and reduce image size; modify the RUN block that calls
"apt-get install -y software-properties-common" and the subsequent "apt-get
install -y cmake ninja-build g++ ..." to add --no-install-recommends to each
apt-get install command and keep the existing && rm -rf /var/lib/apt/lists/*
cleanup unchanged.
- Line 12: The Dockerfile sets ENV LLVM_INSTALL_DIR="nope", which is an invalid
path and breaks tests; update the ENV LLVM_INSTALL_DIR entry to either remove it
entirely or set it to a valid LLVM toolchain install path (replace the value for
LLVM_INSTALL_DIR with the actual installation directory used in your environment
or omit the ENV line so tests can skip), ensuring any references to
LLVM_INSTALL_DIR in the build/test scripts will resolve correctly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 980abba0-cc9f-4991-a904-aa73c9587cd5

📥 Commits

Reviewing files that changed from the base of the PR and between a82fd52 and 7e8736e.

📒 Files selected for processing (5)

Container/Dockerfile.deeploy-xdna
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRRuntimeSequencePass.py
DeeployTest/testUtils/core/execution.py
TargetLibraries/XDNA2/kernels/add.cc
requirements-xdna.txt

✅ Files skipped from review due to trivial changes (1)

requirements-xdna.txt

🚧 Files skipped from review as they are similar to previous changes (2)

DeeployTest/testUtils/core/execution.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRRuntimeSequencePass.py

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/_runner-xdna2.yml:
- Around line 31-32: Validate and safely quote workflow inputs before they are
used: reject or fail fast if inputs.docker-image or inputs.pytest-marker contain
unsafe characters (eg. leading '-' or shell metacharacters) by testing them
against a strict whitelist regex (alphanumeric, ./:_@- and optional tag), then
use quoted interpolations everywhere they are used (e.g. "${{
inputs.docker-image }}" in the docker run/exec command and "${{
inputs.pytest-marker }}" when calling pytest inside bash) and avoid passing
unvalidated values directly into shell or docker arguments (also ensure any
chown or other shell lines use quoted paths and &&/|| chaining safely instead of
relying on interpolation). Ensure the validation step is implemented early in
the job and fails the run with a clear error if inputs do not match the allowed
pattern.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 929e79f9-1673-41e0-abe3-b43f9e0231ed

📥 Commits

Reviewing files that changed from the base of the PR and between 7e8736e and acd5680.

📒 Files selected for processing (1)

.github/workflows/_runner-xdna2.yml

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (3)

TargetLibraries/XDNA2/CMakeLists.txt (1)

29-29: ⚡ Quick win

Add CONFIGURE_DEPENDS to file(GLOB ...) to avoid stale builds when new kernels are added.

Without CONFIGURE_DEPENDS, adding a new *.cc to kernels/ won't trigger a CMake reconfigure, silently omitting the new kernel from the build.

♻️ Proposed fix

-file(GLOB XDNA2_KERNEL_SOURCES "${CMAKE_CURRENT_LIST_DIR}/kernels/*.cc")
+file(GLOB XDNA2_KERNEL_SOURCES CONFIGURE_DEPENDS "${CMAKE_CURRENT_LIST_DIR}/kernels/*.cc")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@TargetLibraries/XDNA2/CMakeLists.txt` at line 29, The file(GLOB ...)
invocation that defines XDNA2_KERNEL_SOURCES should be updated to include the
CONFIGURE_DEPENDS option so CMake will re-run configuration when files under
kernels/ change; locate the file(GLOB XDNA2_KERNEL_SOURCES
"${CMAKE_CURRENT_LIST_DIR}/kernels/*.cc") line in CMakeLists.txt and add
CONFIGURE_DEPENDS to that file(GLOB) call to ensure new *.cc kernel files are
detected and the build is not silently stale.

Deeploy/Targets/XDNA2/Platform.py (1)

118-149: ⚡ Quick win

getTargetMemoryLevel is duplicated verbatim in both MemoryXDNA2Platform and MemoryXDNA2PlatformWrapper.

Extract the shared logic to a module-level helper. The hasattr(engine, 'preferredMemoryLevel') guard is also always True for XDNA2AIECoreEngine (it's set unconditionally in __init__), so it can be dropped.

♻️ Proposed refactor

+def _xdna2_target_memory_level(node: gs.Node, default: MemoryLevel) -> str:
+    """Return the AIE core's preferred memory level, or the platform default."""
+    if hasattr(node, '_engine_assignment'):
+        engine = node._engine_assignment
+        if isinstance(engine, XDNA2AIECoreEngine):
+            return engine.preferredMemoryLevel
+    return default.name
+

 class MemoryXDNA2Platform(MemoryPlatform):
     ...
     def getTargetMemoryLevel(self, node: gs.Node, tensorName: str, ctxt: NetworkContext) -> str:
-        if hasattr(node, '_engine_assignment'):
-            engine = node._engine_assignment
-            if isinstance(engine, XDNA2AIECoreEngine) and hasattr(engine, 'preferredMemoryLevel'):
-                return engine.preferredMemoryLevel
-        return self.defaultTargetMemoryLevel.name
+        return _xdna2_target_memory_level(node, self.defaultTargetMemoryLevel)


 class MemoryXDNA2PlatformWrapper(MemoryPlatformWrapper):
     ...
     def getTargetMemoryLevel(self, node: gs.Node, tensorName: str, ctxt: NetworkContext) -> str:
-        if hasattr(node, '_engine_assignment'):
-            engine = node._engine_assignment
-            if isinstance(engine, XDNA2AIECoreEngine) and hasattr(engine, 'preferredMemoryLevel'):
-                return engine.preferredMemoryLevel
-        return self.defaultTargetMemoryLevel.name
+        return _xdna2_target_memory_level(node, self.defaultTargetMemoryLevel)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Deeploy/Targets/XDNA2/Platform.py` around lines 118 - 149, Both
MemoryXDNA2Platform.getTargetMemoryLevel and
MemoryXDNA2PlatformWrapper.getTargetMemoryLevel duplicate the same logic;
extract that shared logic into a module-level helper (e.g.,
resolve_target_memory_level(engine, default_level)) and have both methods call
it. In the helper use isinstance(engine, XDNA2AIECoreEngine) to return
engine.preferredMemoryLevel and otherwise return default_level.name; drop the
redundant hasattr(engine, 'preferredMemoryLevel') guard because
XDNA2AIECoreEngine always defines it. Update references in getTargetMemoryLevel
in both classes to call the new helper and remove the duplicated code.

DeeployTest/generateNetwork_xdna2.py (1)

58-111: ⚡ Quick win

_generate_xdna2_inputs_header and _generate_xdna2_outputs_header are near-identical.

The only differences are the "Input"/"Output" labels and "input"/"output" macro suffixes. Merging them into one helper avoids duplicate bug-surface.

♻️ Proposed refactor

-def _generate_xdna2_inputs_header(input_arrays: list) -> str:
-    """Generate testinputs.h with raw uint16_t BF16 bit-pattern arrays."""
-    ...
-
-def _generate_xdna2_outputs_header(output_arrays: list) -> str:
-    """Generate testoutputs.h with raw uint16_t BF16 bit-pattern arrays."""
-    ...
+def _generate_xdna2_tensor_header(arrays: list, kind: str) -> str:
+    """Generate a BF16 uint16_t header for *kind* ('input' or 'output')."""
+    Kind = kind.capitalize()
+    lines = []
+    lines.append("// SPDX-FileCopyrightText: 2026 ETH Zurich and University of Bologna")
+    lines.append("// SPDX-License-Identifier: Apache-2.0")
+    lines.append("// Auto-generated by generateNetwork_xdna2.py — do not edit.")
+    lines.append("#pragma once")
+    lines.append("#include <stdint.h>")
+    lines.append("")
+    vec_names = []
+    for idx, arr in enumerate(arrays):
+        bf16 = _float32_to_bf16_uint16(arr.flatten())
+        n = len(bf16)
+        name = f"test{Kind}Vector{idx}"
+        vec_names.append(name)
+        hex_vals = ", ".join(f"0x{v:04x}u" for v in bf16)
+        lines.append(f"static const uint16_t {name}[{n}] = {{{hex_vals}}};")
+        lines.append(f"#define N_ELEMENTS_{kind.upper()}{idx} {n}u")
+        lines.append("")
+    lines.append(f"static const void *test{Kind}Vector[{len(vec_names)}] = {{")
+    lines.append("    " + ", ".join(f"(const void *){n}" for n in vec_names))
+    lines.append("};")
+    lines.append("")
+    return "\n".join(lines)

Call sites:

testInputStr  = _generate_xdna2_tensor_header(test_inputs_f32,  "input")
testOutputStr = _generate_xdna2_tensor_header(test_outputs_bf16, "output")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@DeeployTest/generateNetwork_xdna2.py` around lines 58 - 111, Both
_generate_xdna2_inputs_header and _generate_xdna2_outputs_header are duplicated;
replace them with a single parameterized helper (e.g.
_generate_xdna2_tensor_header(arrays: list, kind: str)) that takes the arrays
and a kind token ("input" or "output") and emits the same content but using kind
to form names like testInputVector/testOutputVector,
N_ELEMENTS_INPUT#/N_ELEMENTS_OUTPUT#, and testInputVector/testOutputVector
table; update call sites that use _generate_xdna2_inputs_header and
_generate_xdna2_outputs_header to call the new _generate_xdna2_tensor_header
with the appropriate kind string, reusing the same internal logic (flatten ->
_float32_to_bf16_uint16, name generation, hex formatting, and final pointer
array construction).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@DeeployTest/generateNetwork_xdna2.py`:
- Line 61: The autogenerated header strings in _generate_xdna2_inputs_header and
_generate_xdna2_outputs_header embed the wrong copyright year (2025); update the
hardcoded year in the string appended (the line that builds "//
SPDX-FileCopyrightText: 2025 ...") to 2026 in both functions so the generated
files show 2026 consistently.
- Line 129: Rename the unused loop variable name to _name and add strict=True to
the zip call to satisfy Ruff warnings; locate the loop using
enumerate(zip(inputs_npz.files, test_inputs_f32)) in generateNetwork_xdna2.py
(the loop that iterates over inputs_npz.files and test_inputs_f32) and change it
so the zip is called with strict=True and the tuple unpacks to (_name, values)
so the unused first element is prefixed with an underscore.

In `@DeeployTest/Platforms/XDNA2/CMakeLists.txt`:
- Around line 71-73: Replace the hardcoded "python" invocation in the CMake
COMMAND that runs AIECC (the COMMAND using ${CMAKE_COMMAND} -E env with
"PATH=${MLIR_AIE_INSTALL_DIR}/bin:$ENV{PATH}" "python" "${AIECC_PY}") with the
resolved Python 3 executable variable ${Python3_EXECUTABLE}; locate the COMMAND
in CMakeLists.txt that references AIECC_PY and swap the "python" token for
${Python3_EXECUTABLE} so the script runs with the intended Python 3 interpreter.

In `@README_XDNA.md`:
- Around line 6-8: Add language identifiers (bash) to all fenced code blocks in
README_XDNA.md: update the docker build command block ("docker build -f
Container/Dockerfile.deeploy-xdna -t deeploy-xdna:local ."), the docker run
block (the multi-line run with --device/--ulimit/-v/--name), the pip install +
python deeployRunner_xdna2.py block, the indented CI/example docker build
snippet, the actions-runner setup block (mkdir actions-runner, ./config.sh --url
... --labels xdna2-npu), and the svc.sh commands (sudo ./svc.sh install/start)
by changing each opening ``` to ```bash so markdownlint MD040 is satisfied.

---

Nitpick comments:
In `@Deeploy/Targets/XDNA2/Platform.py`:
- Around line 118-149: Both MemoryXDNA2Platform.getTargetMemoryLevel and
MemoryXDNA2PlatformWrapper.getTargetMemoryLevel duplicate the same logic;
extract that shared logic into a module-level helper (e.g.,
resolve_target_memory_level(engine, default_level)) and have both methods call
it. In the helper use isinstance(engine, XDNA2AIECoreEngine) to return
engine.preferredMemoryLevel and otherwise return default_level.name; drop the
redundant hasattr(engine, 'preferredMemoryLevel') guard because
XDNA2AIECoreEngine always defines it. Update references in getTargetMemoryLevel
in both classes to call the new helper and remove the duplicated code.

In `@DeeployTest/generateNetwork_xdna2.py`:
- Around line 58-111: Both _generate_xdna2_inputs_header and
_generate_xdna2_outputs_header are duplicated; replace them with a single
parameterized helper (e.g. _generate_xdna2_tensor_header(arrays: list, kind:
str)) that takes the arrays and a kind token ("input" or "output") and emits the
same content but using kind to form names like testInputVector/testOutputVector,
N_ELEMENTS_INPUT#/N_ELEMENTS_OUTPUT#, and testInputVector/testOutputVector
table; update call sites that use _generate_xdna2_inputs_header and
_generate_xdna2_outputs_header to call the new _generate_xdna2_tensor_header
with the appropriate kind string, reusing the same internal logic (flatten ->
_float32_to_bf16_uint16, name generation, hex formatting, and final pointer
array construction).

In `@TargetLibraries/XDNA2/CMakeLists.txt`:
- Line 29: The file(GLOB ...) invocation that defines XDNA2_KERNEL_SOURCES
should be updated to include the CONFIGURE_DEPENDS option so CMake will re-run
configuration when files under kernels/ change; locate the file(GLOB
XDNA2_KERNEL_SOURCES "${CMAKE_CURRENT_LIST_DIR}/kernels/*.cc") line in
CMakeLists.txt and add CONFIGURE_DEPENDS to that file(GLOB) call to ensure new
*.cc kernel files are detected and the build is not silently stale.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ba7431d7-ddea-48c6-80e0-cebc4b8867f8

📥 Commits

Reviewing files that changed from the base of the PR and between acd5680 and cde0c5d.

📒 Files selected for processing (21)

.github/workflows/_runner-xdna2.yml
.github/workflows/ci-platform-xdna2.yml
Deeploy/Targets/XDNA2/Bindings.py
Deeploy/Targets/XDNA2/Deployer.py
Deeploy/Targets/XDNA2/Parsers.py
Deeploy/Targets/XDNA2/Platform.py
Deeploy/Targets/XDNA2/Templates/AddTemplate.py
Deeploy/Targets/XDNA2/Templates/__init__.py
Deeploy/Targets/XDNA2/Tiler.py
Deeploy/Targets/XDNA2/TypeCheckers.py
DeeployTest/Platforms/XDNA2/CMakeLists.txt
DeeployTest/Platforms/XDNA2/main.cpp
DeeployTest/deeployRunner_xdna2.py
DeeployTest/generateNetwork_xdna2.py
DeeployTest/test_xdna2_config.py
README_XDNA.md
TargetLibraries/XDNA2/CMakeLists.txt
TargetLibraries/XDNA2/kernels/add.cc
cmake/xdna2/llvm_aie.cmake
cmake/xdna2/mlir_aie.cmake
requirements-xdna.txt

✅ Files skipped from review due to trivial changes (4)

requirements-xdna.txt
Deeploy/Targets/XDNA2/Templates/init.py
cmake/xdna2/mlir_aie.cmake
Deeploy/Targets/XDNA2/Parsers.py

🚧 Files skipped from review as they are similar to previous changes (7)

DeeployTest/test_xdna2_config.py
Deeploy/Targets/XDNA2/Tiler.py
.github/workflows/ci-platform-xdna2.yml
Deeploy/Targets/XDNA2/Bindings.py
Deeploy/Targets/XDNA2/Templates/AddTemplate.py
DeeployTest/deeployRunner_xdna2.py
Deeploy/Targets/XDNA2/Deployer.py

Xeratec · 2026-05-21T15:18:38Z

Thanks for implementing the changes! I think we are almost there. I resolved most of the comments. Please check the remaining ones, rebase your work, update the Changelog, and let Claude extend the PR description with changes, fixes, and added stuff. Once the CI passes, I can take another quick look and hopefully merge it.

…release

…code transformation passes

…nstallation

…IR-AIE repo

Triggering the CI

coderabbitai

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

Container/Dockerfile.deeploy-xdna (1)
5-57: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Container never drops root privileges.

The image is used to run tests and mounts host paths/devices; running as root increases blast radius if test code is compromised.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Container/Dockerfile.deeploy-xdna` around lines 5 - 57, The Dockerfile does
not define or switch to a non-root user before running applications, leaving the
container running as root which is a security risk. Create a non-root user (such
as appuser) after the package cleanup section using appropriate user creation
commands, set ownership of the /app directory to this user, and add a USER
instruction at the end of the Dockerfile (before or at the WORKDIR /app/Deeploy
line) to ensure the container runs as this non-root user instead of root.
Source: Linters/SAST tools

🧹 Nitpick comments (3)

Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRComputeCorePass.py (1)

67-87: ⚡ Quick win

Avoid assuming one FIFO element type for every tensor.

At Line 69, tileTy is derived from the first input key and then reused for all acquires at Lines 80-87. That couples the pass to homogeneous tensor types, despite this class being documented as operator-agnostic. Use mlirBlock.fifoTypes[key] per tensor key when creating subviews/accesses.

Proposed refactor

-        # Use the first tensor's type as representative tile memref type
-        firstKey = self.inputTensorKeys[0]
-        tileTy = mlirBlock.fifoTypes[firstKey]
-
         `@aie_d.core`(computeTile)
         def _core():
-            subviewTy = aie_d.ObjectFifoSubviewType.get(tileTy)
             for _ in scf_d.for_(0, 0x7FFFFFFFFFFFFFFF, 1):
                 for _ in scf_d.for_(0, numTiles, 1):
                     # Acquire all input FIFO elements
                     acquiredElements = {}
                     for key in self.inputTensorKeys:
+                        tileTy = mlirBlock.fifoTypes[key]
+                        subviewTy = aie_d.ObjectFifoSubviewType.get(tileTy)
                         fifoName = mlirBlock.fifoMap[key]
                         acq = aie_d.objectfifo_acquire(subviewTy, aie_d.ObjectFifoPort.Consume, fifoName, 1)
                         acquiredElements[key] = aie_d.objectfifo_subview_access(tileTy, acq, 0)

                     # Acquire all output FIFO elements
                     for key in self.outputTensorKeys:
+                        tileTy = mlirBlock.fifoTypes[key]
+                        subviewTy = aie_d.ObjectFifoSubviewType.get(tileTy)
                         fifoName = mlirBlock.fifoMap[key]
                         acq = aie_d.objectfifo_acquire(subviewTy, aie_d.ObjectFifoPort.Produce, fifoName, 1)
                         acquiredElements[key] = aie_d.objectfifo_subview_access(tileTy, acq, 0)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRComputeCorePass.py` around
lines 67 - 87, The _core() function currently uses a single tileTy derived from
the first input tensor key and reuses it for all FIFO acquisitions, which
assumes homogeneous tensor types. Instead of using the shared tileTy variable
for all acquire and subview_access operations in the input and output tensor
loops, retrieve the specific fifo type for each tensor key using
mlirBlock.fifoTypes[key] and use that type when creating the
objectfifo_subview_access call for each individual tensor key in both the input
and output acquisition loops.

Deeploy/Targets/XDNA2/CodeTransformationPasses/__init__.py (1)

5-7: ⚡ Quick win

Replace wildcard re-exports with explicit imports and __all__.

This improves static analysis and avoids accidental namespace leakage.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/__init__.py` around lines 5 -
7, Replace the wildcard imports from the three modules (MLIRComputeCorePass,
MLIRObjectFifoPass, and MLIRRuntimeSequencePass) with explicit imports of the
specific classes or functions that each module exports. After converting the
imports to be explicit, add an __all__ list at the module level that explicitly
defines which symbols should be re-exported from this package, ensuring that
only the intended public API is exposed.

Source: Linters/SAST tools

Container/Dockerfile.deeploy-xdna (1)

16-37: ⚡ Quick win

Use --no-install-recommends for apt installs.

This trims unnecessary packages and reduces image size and attack surface.

Suggested patch

-RUN apt-get update && apt-get install -y \
+RUN apt-get update && apt-get install -y --no-install-recommends \
     software-properties-common \
     && add-apt-repository -y ppa:amd-team/xrt \
-    && apt-get update && apt-get install -y \
+    && apt-get update && apt-get install -y --no-install-recommends \
     cmake \
     ninja-build \
     g++ \

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Container/Dockerfile.deeploy-xdna` around lines 16 - 37, The apt-get install
commands in the RUN instruction do not use the --no-install-recommends flag,
which results in unnecessary dependencies being installed and increasing the
Docker image size. Add the --no-install-recommends flag to both apt-get install
-y commands (the one installing software-properties-common and the one
installing the subsequent packages like cmake, ninja-build, g++, etc.) to
prevent the installation of recommended but non-essential packages and reduce
the final image size and attack surface.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/_runner-xdna2.yml:
- Around line 35-38: The Checkout Repo step using actions/checkout@v4 is keeping
credentials in the git config by default, which poses a security risk when the
workspace is mounted into Docker containers on the self-hosted runner. Add the
persist-credentials option set to false in the with section of the
actions/checkout@v4 step to prevent the token from being stored locally and
reduce unnecessary credential exposure.

In @.github/workflows/docker-build-deeploy-xdna.yml:
- Line 18: Replace all mutable version tag references (such as `@v4`) for GitHub
Actions with immutable commit SHAs throughout the docker-build-deeploy-xdna.yml
workflow file. Specifically, update the actions/checkout action at line 18 and
all other action references mentioned at lines 42, 45, 53, 56, 64, 70, and 86 by
converting from version tags like `@v4` to their corresponding full commit SHAs.
This change hardens the CI/CD supply chain security by ensuring that action
definitions cannot be modified after being pinned.
- Line 11: Add an explicit `permissions` block at the top level of the workflow
file (at the same indentation level as the `jobs` key) to define least-privilege
token permissions. For a workflow that pushes to GHCR, include the minimum
required permissions such as `packages: write` for the container registry push
operation, and any other read permissions needed for the specific workflow
steps. This replaces reliance on default token permissions with explicitly
granted capabilities.
- Around line 22-33: The docker tag generation in the Set Docker tag step
creates collision risk by deriving tags from only the last path segment of
GITHUB_REF using parameter expansion (##*/). This means different branches with
identical suffixes will generate the same docker tag, overwriting each other.
Instead of using only the last segment for BRANCH_NAME and TAG_NAME, use the
full GITHUB_REF value or incorporate additional unique context such as
GITHUB_SHA to ensure docker tags are unique across all branches and prevent tag
collisions. Update the echo statements that set BRANCH_NAME and TAG_NAME to use
the complete ref information rather than just the final path component.

In `@CHANGELOG.md`:
- Around line 19-20: The markdown links in the CHANGELOG.md file for PR `#189` and
`#193` have malformed syntax with a leading space between the opening parenthesis
and the URL. Remove the space in both link references so that the markdown link
format changes from `]( https://` to `](https://` for both the `#189` and `#193`
entries to ensure proper link parsing across all markdown renderers.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRObjectFifoPass.py`:
- Around line 56-59: Add a guard clause before the tile size adjustment logic in
the code block containing the modulo operation on numElements. Check if
numElements is zero or invalid (less than or equal to zero) and handle this case
separately before attempting any divisor operations, as zero-sized tensors
should not proceed through the tileSize adjustment logic that divides
numElements by various values.

In `@Deeploy/Targets/XDNA2/Platform.py`:
- Around line 144-146: The constructor in the XDNA2Platform class uses assert
isinstance for type validation, which will be removed when Python runs in
optimized mode (-O flag), allowing invalid objects to bypass validation. Replace
the assert isinstance check with an explicit raise TypeError statement,
maintaining the same descriptive error message. Apply the same change to the
similar validation patterns found in the Deeploy/Targets/PULPOpen/Platform.py,
Deeploy/Targets/GAP9/Platform.py, and Deeploy/Targets/Neureka/Platform.py files
for consistency.

In `@DeeployTest/generateNetwork_xdna2.py`:
- Around line 143-144: The l1_size and l3_size variable assignments on lines
143-144 do not properly validate that provided command-line arguments are
strictly positive values. Currently, zero and negative values can pass through,
and zero is incorrectly treated as "use default". Modify both the l1_size and
l3_size assignments to explicitly check if the retrieved argument value is None
(in which case use the default), and if it is not None, validate that it is
greater than zero before converting to int. If validation fails (value is not
None and not greater than zero), raise an appropriate error to prevent invalid
memory configurations from being created downstream.

In `@DeeployTest/Platforms/XDNA2/main.cpp`:
- Around line 93-102: The current argument parsing assumes positional arguments
are always at fixed positions (argv[1] and argv[2]), which fails when flags like
-v appear first. Instead of checking specific argv indices for xclbin_path and
instr_path, refactor the argument parsing loop to track positional arguments
(those not starting with '-') separately and assign them in order: the first
positional argument to xclbin_path and the second positional argument to
instr_path. This allows flags to appear in any position without breaking the
assignment of positional arguments.

In `@DeeployTest/testUtils/core/execution.py`:
- Around line 205-210: The subprocess.Popen call on lines 205-210 is using
shell=True with a joined command string, which reintroduces command-injection
security risk, and is not passing the env parameter to preserve the VERBOSE and
BANSHEE_LOG environment variables that were prepared earlier on lines 191-202.
Remove the cmd_str variable assignment that joins the cmd list with spaces, pass
the cmd list directly to subprocess.Popen without shell=True, and add the env
parameter to the Popen call to pass through the prepared environment variables
so the runtime configurations are not dropped.

In `@DeeployTest/testUtils/platformMapping.py`:
- Around line 286-287: The except ImportError block raises a RuntimeError
without preserving the original exception cause, losing valuable debugging
information for missing optional dependencies. Capture the ImportError exception
by adding `as e` to the except clause, then modify the raise statement to use
`raise RuntimeError(f"Deployer for platform {platform} is not implemented") from
e` to chain the exceptions and maintain the original error context.

In `@requirements-xdna.txt`:
- Around line 5-7: The GitHub expanded_assets URLs on lines 5-6 are not PEP 503
compliant indexes and should use `--find-links` instead of `--extra-index-url`.
Replace `--extra-index-url` with `--find-links` for both the mlir-aie and
llvm-aie GitHub release URLs. Keep the PyPI URL on line 7 as `--extra-index-url`
since it needs to remain a proper index to resolve transitive dependencies for
the Xilinx packages.

---

Outside diff comments:
In `@Container/Dockerfile.deeploy-xdna`:
- Around line 5-57: The Dockerfile does not define or switch to a non-root user
before running applications, leaving the container running as root which is a
security risk. Create a non-root user (such as appuser) after the package
cleanup section using appropriate user creation commands, set ownership of the
/app directory to this user, and add a USER instruction at the end of the
Dockerfile (before or at the WORKDIR /app/Deeploy line) to ensure the container
runs as this non-root user instead of root.

---

Nitpick comments:
In `@Container/Dockerfile.deeploy-xdna`:
- Around line 16-37: The apt-get install commands in the RUN instruction do not
use the --no-install-recommends flag, which results in unnecessary dependencies
being installed and increasing the Docker image size. Add the
--no-install-recommends flag to both apt-get install -y commands (the one
installing software-properties-common and the one installing the subsequent
packages like cmake, ninja-build, g++, etc.) to prevent the installation of
recommended but non-essential packages and reduce the final image size and
attack surface.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/__init__.py`:
- Around line 5-7: Replace the wildcard imports from the three modules
(MLIRComputeCorePass, MLIRObjectFifoPass, and MLIRRuntimeSequencePass) with
explicit imports of the specific classes or functions that each module exports.
After converting the imports to be explicit, add an __all__ list at the module
level that explicitly defines which symbols should be re-exported from this
package, ensuring that only the intended public API is exposed.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRComputeCorePass.py`:
- Around line 67-87: The _core() function currently uses a single tileTy derived
from the first input tensor key and reuses it for all FIFO acquisitions, which
assumes homogeneous tensor types. Instead of using the shared tileTy variable
for all acquire and subview_access operations in the input and output tensor
loops, retrieve the specific fifo type for each tensor key using
mlirBlock.fifoTypes[key] and use that type when creating the
objectfifo_subview_access call for each individual tensor key in both the input
and output acquisition loops.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2ebbf7f4-adb1-4423-8073-2f5d4ddb78cb

📥 Commits

Reviewing files that changed from the base of the PR and between cde0c5d and b8e9e34.

📒 Files selected for processing (38)

.github/workflows/_runner-xdna2.yml
.github/workflows/ci-platform-xdna2.yml
.github/workflows/docker-build-deeploy-xdna.yml
.gitignore
CHANGELOG.md
CMakeLists.txt
Container/Dockerfile.deeploy-xdna
Deeploy/MLIRDataTypes.py
Deeploy/Targets/XDNA2/Bindings.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRComputeCorePass.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRObjectFifoPass.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRRuntimeSequencePass.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/__init__.py
Deeploy/Targets/XDNA2/Deployer.py
Deeploy/Targets/XDNA2/Parsers.py
Deeploy/Targets/XDNA2/Platform.py
Deeploy/Targets/XDNA2/Templates/AddTemplate.py
Deeploy/Targets/XDNA2/Templates/__init__.py
Deeploy/Targets/XDNA2/Tiler.py
Deeploy/Targets/XDNA2/TypeCheckers.py
DeeployTest/Platforms/XDNA2/CMakeLists.txt
DeeployTest/Platforms/XDNA2/main.cpp
DeeployTest/Tests/Kernels/BF16/Add/Regular/inputs.npz
DeeployTest/Tests/Kernels/BF16/Add/Regular/network.onnx
DeeployTest/Tests/Kernels/BF16/Add/Regular/outputs.npz
DeeployTest/conftest.py
DeeployTest/deeployRunner_xdna2.py
DeeployTest/generateNetwork_xdna2.py
DeeployTest/testUtils/core/execution.py
DeeployTest/testUtils/deeployRunner.py
DeeployTest/testUtils/platformMapping.py
DeeployTest/testUtils/testRunner.py
DeeployTest/test_platforms.py
DeeployTest/test_xdna2_config.py
README_XDNA.md
TargetLibraries/XDNA2/CMakeLists.txt
TargetLibraries/XDNA2/kernels/add.cc
requirements-xdna.txt

✅ Files skipped from review due to trivial changes (4)

.gitignore
Deeploy/Targets/XDNA2/Templates/init.py
Deeploy/Targets/XDNA2/Parsers.py
Deeploy/Targets/XDNA2/Tiler.py

🚧 Files skipped from review as they are similar to previous changes (14)

CMakeLists.txt
.github/workflows/ci-platform-xdna2.yml
Deeploy/Targets/XDNA2/Templates/AddTemplate.py
DeeployTest/deeployRunner_xdna2.py
Deeploy/Targets/XDNA2/Bindings.py
DeeployTest/testUtils/deeployRunner.py
DeeployTest/test_platforms.py
Deeploy/Targets/XDNA2/Deployer.py
TargetLibraries/XDNA2/kernels/add.cc
DeeployTest/Platforms/XDNA2/CMakeLists.txt
TargetLibraries/XDNA2/CMakeLists.txt
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRRuntimeSequencePass.py
DeeployTest/conftest.py
DeeployTest/test_xdna2_config.py

coderabbitai

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

Container/Dockerfile.deeploy-xdna (1)
5-57: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Container never drops root privileges.

The image is used to run tests and mounts host paths/devices; running as root increases blast radius if test code is compromised.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Container/Dockerfile.deeploy-xdna` around lines 5 - 57, The Dockerfile does
not define or switch to a non-root user before running applications, leaving the
container running as root which is a security risk. Create a non-root user (such
as appuser) after the package cleanup section using appropriate user creation
commands, set ownership of the /app directory to this user, and add a USER
instruction at the end of the Dockerfile (before or at the WORKDIR /app/Deeploy
line) to ensure the container runs as this non-root user instead of root.
Source: Linters/SAST tools

🧹 Nitpick comments (3)

Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRComputeCorePass.py (1)

67-87: ⚡ Quick win

Avoid assuming one FIFO element type for every tensor.

At Line 69, tileTy is derived from the first input key and then reused for all acquires at Lines 80-87. That couples the pass to homogeneous tensor types, despite this class being documented as operator-agnostic. Use mlirBlock.fifoTypes[key] per tensor key when creating subviews/accesses.

Proposed refactor

-        # Use the first tensor's type as representative tile memref type
-        firstKey = self.inputTensorKeys[0]
-        tileTy = mlirBlock.fifoTypes[firstKey]
-
         `@aie_d.core`(computeTile)
         def _core():
-            subviewTy = aie_d.ObjectFifoSubviewType.get(tileTy)
             for _ in scf_d.for_(0, 0x7FFFFFFFFFFFFFFF, 1):
                 for _ in scf_d.for_(0, numTiles, 1):
                     # Acquire all input FIFO elements
                     acquiredElements = {}
                     for key in self.inputTensorKeys:
+                        tileTy = mlirBlock.fifoTypes[key]
+                        subviewTy = aie_d.ObjectFifoSubviewType.get(tileTy)
                         fifoName = mlirBlock.fifoMap[key]
                         acq = aie_d.objectfifo_acquire(subviewTy, aie_d.ObjectFifoPort.Consume, fifoName, 1)
                         acquiredElements[key] = aie_d.objectfifo_subview_access(tileTy, acq, 0)

                     # Acquire all output FIFO elements
                     for key in self.outputTensorKeys:
+                        tileTy = mlirBlock.fifoTypes[key]
+                        subviewTy = aie_d.ObjectFifoSubviewType.get(tileTy)
                         fifoName = mlirBlock.fifoMap[key]
                         acq = aie_d.objectfifo_acquire(subviewTy, aie_d.ObjectFifoPort.Produce, fifoName, 1)
                         acquiredElements[key] = aie_d.objectfifo_subview_access(tileTy, acq, 0)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRComputeCorePass.py` around
lines 67 - 87, The _core() function currently uses a single tileTy derived from
the first input tensor key and reuses it for all FIFO acquisitions, which
assumes homogeneous tensor types. Instead of using the shared tileTy variable
for all acquire and subview_access operations in the input and output tensor
loops, retrieve the specific fifo type for each tensor key using
mlirBlock.fifoTypes[key] and use that type when creating the
objectfifo_subview_access call for each individual tensor key in both the input
and output acquisition loops.

Deeploy/Targets/XDNA2/CodeTransformationPasses/__init__.py (1)

5-7: ⚡ Quick win

Replace wildcard re-exports with explicit imports and __all__.

This improves static analysis and avoids accidental namespace leakage.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/__init__.py` around lines 5 -
7, Replace the wildcard imports from the three modules (MLIRComputeCorePass,
MLIRObjectFifoPass, and MLIRRuntimeSequencePass) with explicit imports of the
specific classes or functions that each module exports. After converting the
imports to be explicit, add an __all__ list at the module level that explicitly
defines which symbols should be re-exported from this package, ensuring that
only the intended public API is exposed.

Source: Linters/SAST tools

Container/Dockerfile.deeploy-xdna (1)

16-37: ⚡ Quick win

Use --no-install-recommends for apt installs.

This trims unnecessary packages and reduces image size and attack surface.

Suggested patch

-RUN apt-get update && apt-get install -y \
+RUN apt-get update && apt-get install -y --no-install-recommends \
     software-properties-common \
     && add-apt-repository -y ppa:amd-team/xrt \
-    && apt-get update && apt-get install -y \
+    && apt-get update && apt-get install -y --no-install-recommends \
     cmake \
     ninja-build \
     g++ \

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Container/Dockerfile.deeploy-xdna` around lines 16 - 37, The apt-get install
commands in the RUN instruction do not use the --no-install-recommends flag,
which results in unnecessary dependencies being installed and increasing the
Docker image size. Add the --no-install-recommends flag to both apt-get install
-y commands (the one installing software-properties-common and the one
installing the subsequent packages like cmake, ninja-build, g++, etc.) to
prevent the installation of recommended but non-essential packages and reduce
the final image size and attack surface.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/_runner-xdna2.yml:
- Around line 35-38: The Checkout Repo step using actions/checkout@v4 is keeping
credentials in the git config by default, which poses a security risk when the
workspace is mounted into Docker containers on the self-hosted runner. Add the
persist-credentials option set to false in the with section of the
actions/checkout@v4 step to prevent the token from being stored locally and
reduce unnecessary credential exposure.

In @.github/workflows/docker-build-deeploy-xdna.yml:
- Line 18: Replace all mutable version tag references (such as `@v4`) for GitHub
Actions with immutable commit SHAs throughout the docker-build-deeploy-xdna.yml
workflow file. Specifically, update the actions/checkout action at line 18 and
all other action references mentioned at lines 42, 45, 53, 56, 64, 70, and 86 by
converting from version tags like `@v4` to their corresponding full commit SHAs.
This change hardens the CI/CD supply chain security by ensuring that action
definitions cannot be modified after being pinned.
- Line 11: Add an explicit `permissions` block at the top level of the workflow
file (at the same indentation level as the `jobs` key) to define least-privilege
token permissions. For a workflow that pushes to GHCR, include the minimum
required permissions such as `packages: write` for the container registry push
operation, and any other read permissions needed for the specific workflow
steps. This replaces reliance on default token permissions with explicitly
granted capabilities.
- Around line 22-33: The docker tag generation in the Set Docker tag step
creates collision risk by deriving tags from only the last path segment of
GITHUB_REF using parameter expansion (##*/). This means different branches with
identical suffixes will generate the same docker tag, overwriting each other.
Instead of using only the last segment for BRANCH_NAME and TAG_NAME, use the
full GITHUB_REF value or incorporate additional unique context such as
GITHUB_SHA to ensure docker tags are unique across all branches and prevent tag
collisions. Update the echo statements that set BRANCH_NAME and TAG_NAME to use
the complete ref information rather than just the final path component.

In `@CHANGELOG.md`:
- Around line 19-20: The markdown links in the CHANGELOG.md file for PR `#189` and
`#193` have malformed syntax with a leading space between the opening parenthesis
and the URL. Remove the space in both link references so that the markdown link
format changes from `]( https://` to `](https://` for both the `#189` and `#193`
entries to ensure proper link parsing across all markdown renderers.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRObjectFifoPass.py`:
- Around line 56-59: Add a guard clause before the tile size adjustment logic in
the code block containing the modulo operation on numElements. Check if
numElements is zero or invalid (less than or equal to zero) and handle this case
separately before attempting any divisor operations, as zero-sized tensors
should not proceed through the tileSize adjustment logic that divides
numElements by various values.

In `@Deeploy/Targets/XDNA2/Platform.py`:
- Around line 144-146: The constructor in the XDNA2Platform class uses assert
isinstance for type validation, which will be removed when Python runs in
optimized mode (-O flag), allowing invalid objects to bypass validation. Replace
the assert isinstance check with an explicit raise TypeError statement,
maintaining the same descriptive error message. Apply the same change to the
similar validation patterns found in the Deeploy/Targets/PULPOpen/Platform.py,
Deeploy/Targets/GAP9/Platform.py, and Deeploy/Targets/Neureka/Platform.py files
for consistency.

In `@DeeployTest/generateNetwork_xdna2.py`:
- Around line 143-144: The l1_size and l3_size variable assignments on lines
143-144 do not properly validate that provided command-line arguments are
strictly positive values. Currently, zero and negative values can pass through,
and zero is incorrectly treated as "use default". Modify both the l1_size and
l3_size assignments to explicitly check if the retrieved argument value is None
(in which case use the default), and if it is not None, validate that it is
greater than zero before converting to int. If validation fails (value is not
None and not greater than zero), raise an appropriate error to prevent invalid
memory configurations from being created downstream.

In `@DeeployTest/Platforms/XDNA2/main.cpp`:
- Around line 93-102: The current argument parsing assumes positional arguments
are always at fixed positions (argv[1] and argv[2]), which fails when flags like
-v appear first. Instead of checking specific argv indices for xclbin_path and
instr_path, refactor the argument parsing loop to track positional arguments
(those not starting with '-') separately and assign them in order: the first
positional argument to xclbin_path and the second positional argument to
instr_path. This allows flags to appear in any position without breaking the
assignment of positional arguments.

In `@DeeployTest/testUtils/core/execution.py`:
- Around line 205-210: The subprocess.Popen call on lines 205-210 is using
shell=True with a joined command string, which reintroduces command-injection
security risk, and is not passing the env parameter to preserve the VERBOSE and
BANSHEE_LOG environment variables that were prepared earlier on lines 191-202.
Remove the cmd_str variable assignment that joins the cmd list with spaces, pass
the cmd list directly to subprocess.Popen without shell=True, and add the env
parameter to the Popen call to pass through the prepared environment variables
so the runtime configurations are not dropped.

In `@DeeployTest/testUtils/platformMapping.py`:
- Around line 286-287: The except ImportError block raises a RuntimeError
without preserving the original exception cause, losing valuable debugging
information for missing optional dependencies. Capture the ImportError exception
by adding `as e` to the except clause, then modify the raise statement to use
`raise RuntimeError(f"Deployer for platform {platform} is not implemented") from
e` to chain the exceptions and maintain the original error context.

In `@requirements-xdna.txt`:
- Around line 5-7: The GitHub expanded_assets URLs on lines 5-6 are not PEP 503
compliant indexes and should use `--find-links` instead of `--extra-index-url`.
Replace `--extra-index-url` with `--find-links` for both the mlir-aie and
llvm-aie GitHub release URLs. Keep the PyPI URL on line 7 as `--extra-index-url`
since it needs to remain a proper index to resolve transitive dependencies for
the Xilinx packages.

---

Outside diff comments:
In `@Container/Dockerfile.deeploy-xdna`:
- Around line 5-57: The Dockerfile does not define or switch to a non-root user
before running applications, leaving the container running as root which is a
security risk. Create a non-root user (such as appuser) after the package
cleanup section using appropriate user creation commands, set ownership of the
/app directory to this user, and add a USER instruction at the end of the
Dockerfile (before or at the WORKDIR /app/Deeploy line) to ensure the container
runs as this non-root user instead of root.

---

Nitpick comments:
In `@Container/Dockerfile.deeploy-xdna`:
- Around line 16-37: The apt-get install commands in the RUN instruction do not
use the --no-install-recommends flag, which results in unnecessary dependencies
being installed and increasing the Docker image size. Add the
--no-install-recommends flag to both apt-get install -y commands (the one
installing software-properties-common and the one installing the subsequent
packages like cmake, ninja-build, g++, etc.) to prevent the installation of
recommended but non-essential packages and reduce the final image size and
attack surface.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/__init__.py`:
- Around line 5-7: Replace the wildcard imports from the three modules
(MLIRComputeCorePass, MLIRObjectFifoPass, and MLIRRuntimeSequencePass) with
explicit imports of the specific classes or functions that each module exports.
After converting the imports to be explicit, add an __all__ list at the module
level that explicitly defines which symbols should be re-exported from this
package, ensuring that only the intended public API is exposed.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRComputeCorePass.py`:
- Around line 67-87: The _core() function currently uses a single tileTy derived
from the first input tensor key and reuses it for all FIFO acquisitions, which
assumes homogeneous tensor types. Instead of using the shared tileTy variable
for all acquire and subview_access operations in the input and output tensor
loops, retrieve the specific fifo type for each tensor key using
mlirBlock.fifoTypes[key] and use that type when creating the
objectfifo_subview_access call for each individual tensor key in both the input
and output acquisition loops.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2ebbf7f4-adb1-4423-8073-2f5d4ddb78cb

📥 Commits

Reviewing files that changed from the base of the PR and between cde0c5d and b8e9e34.

📒 Files selected for processing (38)

.github/workflows/_runner-xdna2.yml
.github/workflows/ci-platform-xdna2.yml
.github/workflows/docker-build-deeploy-xdna.yml
.gitignore
CHANGELOG.md
CMakeLists.txt
Container/Dockerfile.deeploy-xdna
Deeploy/MLIRDataTypes.py
Deeploy/Targets/XDNA2/Bindings.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRComputeCorePass.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRObjectFifoPass.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRRuntimeSequencePass.py
Deeploy/Targets/XDNA2/CodeTransformationPasses/__init__.py
Deeploy/Targets/XDNA2/Deployer.py
Deeploy/Targets/XDNA2/Parsers.py
Deeploy/Targets/XDNA2/Platform.py
Deeploy/Targets/XDNA2/Templates/AddTemplate.py
Deeploy/Targets/XDNA2/Templates/__init__.py
Deeploy/Targets/XDNA2/Tiler.py
Deeploy/Targets/XDNA2/TypeCheckers.py
DeeployTest/Platforms/XDNA2/CMakeLists.txt
DeeployTest/Platforms/XDNA2/main.cpp
DeeployTest/Tests/Kernels/BF16/Add/Regular/inputs.npz
DeeployTest/Tests/Kernels/BF16/Add/Regular/network.onnx
DeeployTest/Tests/Kernels/BF16/Add/Regular/outputs.npz
DeeployTest/conftest.py
DeeployTest/deeployRunner_xdna2.py
DeeployTest/generateNetwork_xdna2.py
DeeployTest/testUtils/core/execution.py
DeeployTest/testUtils/deeployRunner.py
DeeployTest/testUtils/platformMapping.py
DeeployTest/testUtils/testRunner.py
DeeployTest/test_platforms.py
DeeployTest/test_xdna2_config.py
README_XDNA.md
TargetLibraries/XDNA2/CMakeLists.txt
TargetLibraries/XDNA2/kernels/add.cc
requirements-xdna.txt

✅ Files skipped from review due to trivial changes (4)

.gitignore
Deeploy/Targets/XDNA2/Templates/init.py
Deeploy/Targets/XDNA2/Parsers.py
Deeploy/Targets/XDNA2/Tiler.py

🚧 Files skipped from review as they are similar to previous changes (14)

CMakeLists.txt
.github/workflows/ci-platform-xdna2.yml
Deeploy/Targets/XDNA2/Templates/AddTemplate.py
DeeployTest/deeployRunner_xdna2.py
Deeploy/Targets/XDNA2/Bindings.py
DeeployTest/testUtils/deeployRunner.py
DeeployTest/test_platforms.py
Deeploy/Targets/XDNA2/Deployer.py
TargetLibraries/XDNA2/kernels/add.cc
DeeployTest/Platforms/XDNA2/CMakeLists.txt
TargetLibraries/XDNA2/CMakeLists.txt
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRRuntimeSequencePass.py
DeeployTest/conftest.py
DeeployTest/test_xdna2_config.py

🛑 Comments failed to post (12)

.github/workflows/_runner-xdna2.yml (1)
35-38: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Disable checkout credential persistence before mounting workspace into containers.

actions/checkout keeps a token in local git config by default. In this job, the workspace is later mounted into Docker, which unnecessarily expands token exposure on a self-hosted runner.
Suggested patch
       - name: Checkout Repo
         uses: actions/checkout@v4
         with:
+          persist-credentials: false
           submodules: recursive
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
      - name: Checkout Repo
        uses: actions/checkout@v4
        with:
          persist-credentials: false
          submodules: recursive
🧰 Tools

🪛 zizmor (1.25.2)

[warning] 35-38: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)

[error] 36-36: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/_runner-xdna2.yml around lines 35 - 38, The Checkout Repo
step using actions/checkout@v4 is keeping credentials in the git config by
default, which poses a security risk when the workspace is mounted into Docker
containers on the self-hosted runner. Add the persist-credentials option set to
false in the with section of the actions/checkout@v4 step to prevent the token
from being stored locally and reduce unnecessary credential exposure.
Source: Linters/SAST tools
.github/workflows/docker-build-deeploy-xdna.yml (3)
11-11: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add an explicit least-privilege permissions block.

This workflow pushes to GHCR and currently relies on default token permissions.
Suggested patch
 "on":
   workflow_dispatch:
+
+permissions:
+  contents: read
+  packages: write
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/docker-build-deeploy-xdna.yml at line 11, Add an explicit
`permissions` block at the top level of the workflow file (at the same
indentation level as the `jobs` key) to define least-privilege token
permissions. For a workflow that pushes to GHCR, include the minimum required
permissions such as `packages: write` for the container registry push operation,
and any other read permissions needed for the specific workflow steps. This
replaces reliance on default token permissions with explicitly granted
capabilities.
Source: Linters/SAST tools

18-18: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pin action references to commit SHAs.

Version tags (@v*) are mutable; pinning to SHAs hardens the supply chain for CI.

Also applies to: 42-42, 45-45, 53-53, 56-56, 64-64, 70-70, 86-86

🧰 Tools

🪛 zizmor (1.25.2)

[warning] 18-18: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)

[error] 18-18: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/docker-build-deeploy-xdna.yml at line 18, Replace all
mutable version tag references (such as `@v4`) for GitHub Actions with immutable
commit SHAs throughout the docker-build-deeploy-xdna.yml workflow file.
Specifically, update the actions/checkout action at line 18 and all other action
references mentioned at lines 42, 45, 53, 56, 64, 70, and 86 by converting from
version tags like `@v4` to their corresponding full commit SHAs. This change
hardens the CI/CD supply chain security by ensuring that action definitions
cannot be modified after being pinned.
Source: Linters/SAST tools

22-33: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Docker tag derivation can collide across branches.

Using only the last path segment of the ref can overwrite image tags for different branches with the same suffix.
Suggested patch
       - name: Set up environment variables
         run: |
-          echo "BRANCH_NAME=${GITHUB_REF##*/}" >> $GITHUB_ENV
-          echo "TAG_NAME=${GITHUB_REF##*/}" >> $GITHUB_ENV
+          echo "REF_NAME=${GITHUB_REF_NAME}" >> $GITHUB_ENV
           echo "IS_TAG=${GITHUB_REF_TYPE}" >> $GITHUB_ENV

       - name: Set Docker tag
         id: generate_tag
         run: |
           if [[ "${{ env.IS_TAG }}" == "tag" ]]; then
-            echo "docker_tag=${{ env.TAG_NAME }}" >> $GITHUB_OUTPUT
+            echo "docker_tag=${{ env.REF_NAME }}" >> $GITHUB_OUTPUT
           else
-            echo "docker_tag=${{ env.BRANCH_NAME }}" >> $GITHUB_OUTPUT
+            echo "docker_tag=${{ env.REF_NAME//\//- }}" >> $GITHUB_OUTPUT
           fi
🧰 Tools

🪛 zizmor (1.25.2)

[warning] 12-33: overly broad permissions (excessive-permissions): default permissions used due to no permissions: block

(excessive-permissions)

[warning] 29-29: code injection via template expansion (template-injection): may expand into attacker-controllable code

(template-injection)

[warning] 30-30: code injection via template expansion (template-injection): may expand into attacker-controllable code

(template-injection)

[warning] 32-32: code injection via template expansion (template-injection): may expand into attacker-controllable code

(template-injection)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/docker-build-deeploy-xdna.yml around lines 22 - 33, The
docker tag generation in the Set Docker tag step creates collision risk by
deriving tags from only the last path segment of GITHUB_REF using parameter
expansion (##*/). This means different branches with identical suffixes will
generate the same docker tag, overwriting each other. Instead of using only the
last segment for BRANCH_NAME and TAG_NAME, use the full GITHUB_REF value or
incorporate additional unique context such as GITHUB_SHA to ensure docker tags
are unique across all branches and prevent tag collisions. Update the echo
statements that set BRANCH_NAME and TAG_NAME to use the complete ref information
rather than just the final path component.
CHANGELOG.md (1)
19-20: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix malformed Markdown links for PR #189 and #193.

There is a leading space before the URL, which can break link parsing in some renderers.
Suggested patch
-- Fix for python error when using python 3.12.11 [`#189`]( https://github.com/pulp-platform/Deeploy/pull/189)
-- Add support for Operators for Generic target needed in MAGIA [`#193`]( https://github.com/pulp-platform/Deeploy/pull/193)
+- Fix for python error when using python 3.12.11 [`#189`](https://github.com/pulp-platform/Deeploy/pull/189)
+- Add support for Operators for Generic target needed in MAGIA [`#193`](https://github.com/pulp-platform/Deeploy/pull/193)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- Fix for python error when using python 3.12.11 [`#189`](https://github.com/pulp-platform/Deeploy/pull/189)
- Add support for Operators for Generic target needed in MAGIA [`#193`](https://github.com/pulp-platform/Deeploy/pull/193)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@CHANGELOG.md` around lines 19 - 20, The markdown links in the CHANGELOG.md
file for PR `#189` and `#193` have malformed syntax with a leading space between the
opening parenthesis and the URL. Remove the space in both link references so
that the markdown link format changes from `]( https://` to `](https://` for
both the `#189` and `#193` entries to ensure proper link parsing across all markdown
renderers.
Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRObjectFifoPass.py (1)
56-59: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard zero-sized tensors before divisor logic.

If numElements is 0, numElements % tileSize will raise due to zero divisor.
Suggested patch
 def _deriveTileSize(numElements: int, patternMemoryConstraint) -> int:
     """Extract tile size from the tiling solution, or fall back to MAX_TILE_SIZE."""
+    if numElements <= 0:
+        return 1
     tileSize = min(numElements, MAX_TILE_SIZE)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Deeploy/Targets/XDNA2/CodeTransformationPasses/MLIRObjectFifoPass.py` around
lines 56 - 59, Add a guard clause before the tile size adjustment logic in the
code block containing the modulo operation on numElements. Check if numElements
is zero or invalid (less than or equal to zero) and handle this case separately
before attempting any divisor operations, as zero-sized tensors should not
proceed through the tileSize adjustment logic that divides numElements by
various values.
Deeploy/Targets/XDNA2/Platform.py (1)
144-146: ⚠️ Potential issue | 🟡 Minor
🧩 Analysis chain

🏁 Script executed:
#!/bin/bash
set -euo pipefail

# Verify assert-based runtime guards in Python sources.
# Expected: this file appears in results; use output to decide if broader cleanup is needed.
rg -nP '^\s*assert\s+isinstance\s*\(' --type=py -C2
Repository: pulp-platform/Deeploy

Length of output: 26546

🏁 Script executed:
# Check if there are tests or code that catch AssertionError specifically
# Look for AssertionError handling in relation to Platform or XDNA2Platform
rg -A3 -B3 'except.*AssertionError|pytest.raises.*AssertionError' --type=py | head -50

# Also check if there are any test files that specifically test the assertion
rg 'XDNA2Platform|MemoryXDNA2PlatformWrapper' --type=py -l | xargs -I {} sh -c 'echo "=== {} ===" && grep -n "assert isinstance" {}'
Repository: pulp-platform/Deeploy

Length of output: 1366

🏁 Script executed:
# Verify no calling code relies on catching AssertionError from Platform constructors
rg 'MemoryXDNA2PlatformWrapper\(' --type=py -B2 -A5
Repository: pulp-platform/Deeploy

Length of output: 653
Use explicit exception instead of assert for constructor type validation.

Line 144 uses assert isinstance(...) for runtime validation. In optimized Python runs (python -O), that assertion is stripped, allowing invalid platform objects to bypass the constructor check and fail later in less obvious ways. Replace with an explicit TypeError.
Proposed fix
-        assert isinstance(platform, XDNA2Platform), \
-            f"Given platform is not an instance of XDNA2Platform. Platform type: {type(platform).__name__}"
+        if not isinstance(platform, XDNA2Platform):
+            raise TypeError(
+                f"Given platform is not an instance of XDNA2Platform. Platform type: {type(platform).__name__}"
+            )
Note: Similar patterns exist in Deeploy/Targets/PULPOpen/Platform.py, Deeploy/Targets/GAP9/Platform.py, and Deeploy/Targets/Neureka/Platform.py that should be updated for consistency.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Deeploy/Targets/XDNA2/Platform.py` around lines 144 - 146, The constructor in
the XDNA2Platform class uses assert isinstance for type validation, which will
be removed when Python runs in optimized mode (-O flag), allowing invalid
objects to bypass validation. Replace the assert isinstance check with an
explicit raise TypeError statement, maintaining the same descriptive error
message. Apply the same change to the similar validation patterns found in the
Deeploy/Targets/PULPOpen/Platform.py, Deeploy/Targets/GAP9/Platform.py, and
Deeploy/Targets/Neureka/Platform.py files for consistency.
DeeployTest/generateNetwork_xdna2.py (1)
143-144: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate --l1 / --l3 as strictly positive values.

Line 143 and Line 144 currently treat 0 as “use default” and allow negatives through. That can hide bad CLI input and create invalid memory configurations downstream. Parse as: None => default, otherwise require > 0.
Proposed fix
-    l1_size = int(getattr(args, 'l1', None) or 64000)  # 64KB default
-    l3_size = int(getattr(args, 'l3', None) or 128 * 1024 * 1024)  # 128MB default
+    l1_arg = getattr(args, 'l1', None)
+    l3_arg = getattr(args, 'l3', None)
+    if l1_arg is not None and l1_arg <= 0:
+        raise ValueError(f"Invalid L1 size: {l1_arg}. Expected a positive integer.")
+    if l3_arg is not None and l3_arg <= 0:
+        raise ValueError(f"Invalid L3 size: {l3_arg}. Expected a positive integer.")
+
+    l1_size = 64000 if l1_arg is None else int(l1_arg)  # 64KB default
+    l3_size = 128 * 1024 * 1024 if l3_arg is None else int(l3_arg)  # 128MB default
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
    l1_arg = getattr(args, 'l1', None)
    l3_arg = getattr(args, 'l3', None)
    if l1_arg is not None and l1_arg <= 0:
        raise ValueError(f"Invalid L1 size: {l1_arg}. Expected a positive integer.")
    if l3_arg is not None and l3_arg <= 0:
        raise ValueError(f"Invalid L3 size: {l3_arg}. Expected a positive integer.")

    l1_size = 64000 if l1_arg is None else int(l1_arg)  # 64KB default
    l3_size = 128 * 1024 * 1024 if l3_arg is None else int(l3_arg)  # 128MB default
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@DeeployTest/generateNetwork_xdna2.py` around lines 143 - 144, The l1_size and
l3_size variable assignments on lines 143-144 do not properly validate that
provided command-line arguments are strictly positive values. Currently, zero
and negative values can pass through, and zero is incorrectly treated as "use
default". Modify both the l1_size and l3_size assignments to explicitly check if
the retrieved argument value is None (in which case use the default), and if it
is not None, validate that it is greater than zero before converting to int. If
validation fails (value is not None and not greater than zero), raise an
appropriate error to prevent invalid memory configurations from being created
downstream.
DeeployTest/Platforms/XDNA2/main.cpp (1)
93-102: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Positional argument parsing breaks when flags come first.

Example: main -v custom.xclbin custom.bin assigns custom.xclbin to instr_path and keeps default xclbin_path.
Suggested patch
-  if (argc >= 2 && argv[1][0] != '-')
-    xclbin_path = argv[1];
-  if (argc >= 3 && argv[2][0] != '-')
-    instr_path = argv[2];
+  std::vector<std::string> positional;
+  for (int i = 1; i < argc; ++i) {
+    if (argv[i][0] != '-')
+      positional.emplace_back(argv[i]);
+  }
+  if (!positional.empty())
+    xclbin_path = positional[0];
+  if (positional.size() > 1)
+    instr_path = positional[1];
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@DeeployTest/Platforms/XDNA2/main.cpp` around lines 93 - 102, The current
argument parsing assumes positional arguments are always at fixed positions
(argv[1] and argv[2]), which fails when flags like -v appear first. Instead of
checking specific argv indices for xclbin_path and instr_path, refactor the
argument parsing loop to track positional arguments (those not starting with
'-') separately and assign them in order: the first positional argument to
xclbin_path and the second positional argument to instr_path. This allows flags
to appear in any position without breaking the assignment of positional
arguments.
DeeployTest/testUtils/core/execution.py (1)
205-210: ⚠️ Potential issue | 🔴 Critical
🧩 Analysis chain

🏁 Script executed:
#!/bin/bash
# Verify no shell=True process launch remains in this execution path
rg -n -C2 'subprocess\.Popen$|shell\s*=' DeeployTest/testUtils/core/execution.py

# Verify env variables set for simulation are propagated to Popen
rg -n -C3 'env\s*=\s*os\.environ\.copy\($|BANSHEE_LOG|VERBOSE|Popen\(' DeeployTest/testUtils/core/execution.py
Repository: pulp-platform/Deeploy

Length of output: 1671
Use argv execution (no shell) and pass env to preserve runtime behavior.

Lines 205–210 rebuild a shell command string and execute with shell=True, reintroducing command-injection risk. Additionally, the env variables prepared on lines 191–202 (VERBOSE, BANSHEE_LOG) are never passed to Popen, so those runtime configurations are dropped.
Suggested fix
-    cmd_str = " ".join(cmd)
-    with subprocess.Popen(cmd_str,
+    with subprocess.Popen(cmd,
                           stdout = subprocess.PIPE,
                           stderr = subprocess.STDOUT,
-                          shell = True,
-                          encoding = 'utf-8') as process:
+                          shell = False,
+                          encoding = 'utf-8',
+                          env = env) as process:
🧰 Tools

🪛 ast-grep (0.43.0)

[error] 205-209: Use of unsanitized data to create processes
Context: subprocess.Popen(cmd_str,
stdout = subprocess.PIPE,
stderr = subprocess.STDOUT,
shell = True,
encoding = 'utf-8')
Note: [CWE-78].

(os-system-unsanitized-data)

[error] 205-209: Command coming from incoming request
Context: subprocess.Popen(cmd_str,
stdout = subprocess.PIPE,
stderr = subprocess.STDOUT,
shell = True,
encoding = 'utf-8')
Note: [CWE-20].

(subprocess-from-request)

🪛 OpenGrep (1.22.0)

[ERROR] 206-210: Dynamic command passed to subprocess with shell=True. Use a command list without shell=True, or use shlex.quote() to sanitize input.

(coderabbit.command-injection.python-shell-true)

🪛 Ruff (0.15.17)

[error] 206-206: subprocess call with shell=True identified, security issue

(S602)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@DeeployTest/testUtils/core/execution.py` around lines 205 - 210, The
subprocess.Popen call on lines 205-210 is using shell=True with a joined command
string, which reintroduces command-injection security risk, and is not passing
the env parameter to preserve the VERBOSE and BANSHEE_LOG environment variables
that were prepared earlier on lines 191-202. Remove the cmd_str variable
assignment that joins the cmd list with spaces, pass the cmd list directly to
subprocess.Popen without shell=True, and add the env parameter to the Popen call
to pass through the prepared environment variables so the runtime configurations
are not dropped.
Source: Linters/SAST tools
DeeployTest/testUtils/platformMapping.py (1)
286-287: ⚠️ Potential issue | 🟡 Minor
🧩 Analysis chain

🏁 Script executed:
#!/bin/bash
ruff check --select B904 DeeployTest/testUtils/platformMapping.py
Repository: pulp-platform/Deeploy

Length of output: 840
Preserve the original ImportError cause when re-raising.

When re-raising exceptions in an except block, chain the exception to maintain the original error context. This helps with debugging missing optional dependencies.
Proposed fix
-        except ImportError:
-            raise RuntimeError(f"Deployer for platform {platform} is not implemented")
+        except ImportError as err:
+            raise RuntimeError(f"Deployer for platform {platform} is not implemented") from err
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
        except ImportError as err:
            raise RuntimeError(f"Deployer for platform {platform} is not implemented") from err
🧰 Tools

🪛 Ruff (0.15.17)

[warning] 287-287: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@DeeployTest/testUtils/platformMapping.py` around lines 286 - 287, The except
ImportError block raises a RuntimeError without preserving the original
exception cause, losing valuable debugging information for missing optional
dependencies. Capture the ImportError exception by adding `as e` to the except
clause, then modify the raise statement to use `raise RuntimeError(f"Deployer
for platform {platform} is not implemented") from e` to chain the exceptions and
maintain the original error context.
Source: Linters/SAST tools
requirements-xdna.txt (1)
5-7: ⚠️ Potential issue | 🟠 Major
🧩 Analysis chain

🌐 Web query:

Does pip expect --extra-index-urltargets to be PEP 503 "simple" indexes, and should GitHub Releasesexpanded_assetspages be consumed via--find-links instead?

💡 Result:

Yes, pip explicitly expects targets provided via --extra-index-url to be PEP 503 compliant simple repositories [1][2][3]. A PEP 503 index must be a repository that serves an HTML page for each project, where the page contains anchor tags pointing to the package files (wheels or sdists) [4][5]. You should not consume GitHub Releases pages directly via --extra-index-url because those pages are standard web interfaces for users, not the machine-readable simple repository API required by pip [4][5]. Instead, you should use --find-links if you want to point pip toward a location that contains direct links to package files, such as a directory or an HTML page that is not necessarily a full PEP 503 index [6][7]. However, note that --find-links is intended for looking for archives (like.whl or.tar.gz files) and does not support the same structured discovery and version resolution that a full PEP 503 repository provides [6][7]. If you intend to use GitHub as a package host, the recommended approach is to serve a static, PEP 503-compliant index (often generated by tools that crawl your releases) via GitHub Pages, rather than pointing pip directly at the GitHub Releases interface [8].

Citations:

1: https://pip.pypa.io/en/stable/cli/pip_install.html

2: https://pip.pypa.io/en/stable/cli/pip%5Findex/

3: https://pip.pypa.io/en/latest/cli/pip_install/

4: https://pip.pypa.io/en/latest/development/architecture/package-finding/

5: https://pip.pypa.io/en/stable/development/architecture/package-finding/

6: https://stackoverflow.com/questions/46651454/whats-the-difference-between-find-links-and-index-url-pip-flags

7: https://pip.pypa.io/en/latest/cli/pip_index/

8: https://www.bomberbot.com/python/leveraging-github-as-a-pypi-server-an-in-depth-guide/

🏁 Script executed:
# Search for any documentation in the repository about how to use these packages
find . -type f $ -name "*.md" -o -name "*.txt" -o -name "*.rst" $ -exec grep -l "mlir-aie\|llvm-aie\|expanded_assets" {} \;
Repository: pulp-platform/Deeploy

Length of output: 188

🌐 Web query:

Xilinx mlir-aie llvm-aie pip installation requirements GitHub releases recommended configuration

💡 Result:

To install and configure the Xilinx MLIR-AIE and LLVM-AIE (Peano compiler) toolchains, follow these requirements and steps. Prerequisites Supported Python versions include 3.10, 3.11, 3.12, 3.13, and 3.14 [1][2][3]. You must have the following system packages installed on Linux/WSL [4][2]: - build-essential, clang, clang-14, lld, lld-14 - cmake (version 3.30 or newer required) [5][1] - ninja-build, python3-venv, python3-pip Installation Steps The project relies on pip wheels hosted on GitHub releases [5][1]. It is critical to synchronize the mlir-aie wheel version, the repository commit, and any dependencies [5][2]. 1. Set up a virtual environment: python3 -m venv ironenv source ironenv/bin/activate python3 -m pip install --upgrade pip 2. Install CMake (if the system version is below 3.30): python3 -m pip install --upgrade cmake 3. Install the mlir-aie wheel: For the latest version (not necessarily a formal release): python3 -m pip install mlir_aie -f https://github.com/Xilinx/mlir-aie/releases/expanded_assets/latest-wheels-3 For a specific version, locate the tag on the GitHub releases page and use: python3 -m pip install mlir_aie== -f https://github.com/Xilinx/mlir-aie/releases/expanded_assets/v 4. Install the Peano compiler (llvm-aie): python3 -m pip install llvm-aie -f https://github.com/Xilinx/llvm-aie/releases/expanded_assets/nightly Recommended Configuration After installation, you must configure your environment variables to ensure the tools and libraries are discoverable [6][7][8]. This is typically handled by sourcing utility scripts provided in the repository (e.g., utils/env_setup.sh), which sets paths for the compiler binaries, libraries, and Python modules [6][7][8]. For Ryzen AI application development, ensure you have your XRT (Xilinx Runtime) environment sourced before the MLIR-AIE environment [6][7]. Note that installing from wheels via this process supports AIE-ML (AIE2) and AIE2P, but does not support Versal devices with classic AIE [6][7]. For development and testing, you may also need to install the project's development requirements: python3 -m pip install -r python/requirements_dev.txt [4][1]

Citations:

1: https://github.com/xilinx/mlir-aie

2: https://github.com/Xilinx/mlir-aie/blob/main/README.md

3: https://github.com/Xilinx/mlir-aie/releases/tag/v1.3.0

4: https://xilinx.github.io/mlir-aie/Building.html

5: https://xilinx.github.io/mlir-aie/

6: https://github.com/Xilinx/mlir-aie/blob/main/docs/buildHostLin.md

7: https://xilinx.github.io/mlir-aie/buildHostWin.html

8: https://xilinx.github.io/mlir-air/buildingRyzenLin.html

🌐 Web query:

GitHub releases expanded_assets pip --find-links vs --extra-index-url compatibility

💡 Result:

The flags --find-links and --extra-index-url are both fully compatible and designed to work together in pip to expand the search space for packages beyond the default index [1][2]. Pip does not treat these locations as having a hierarchical priority; instead, it collects all available candidates from all specified locations (the default index, any extra indexes, and any find-links locations) and selects the "best" match based primarily on version number [1][3]. Key differences in their behavior include: 1. Repository Format: The --extra-index-url flag expects a repository that follows the PEP 503 Simple Repository API (a predictable URL structure containing HTML pages of links) [4][2]. In contrast, --find-links can point to a flat local directory of files, an HTML page containing links to archives, or a remote repository, making it more flexible for non-standard or local setups [1][5]. 2. Search Mechanism: Pip's PackageFinder gathers candidates from all configured sources [2][6]. Because pip prioritizes the highest version number found across all sources rather than the source itself, this behavior can lead to a security issue known as "dependency confusion" if a malicious actor publishes a higher version of a package to a public index that you also intend to source privately [1][7]. 3. Recommendations: Because pip does not natively support prioritizing one source over another for equivalent versions, the official recommendation to avoid dependency confusion is to use --no-index in combination with --find-links for local assets, or to use a private, pass-through index server that explicitly controls which packages are served [1][7]. Using --extra-index-url for private packages is considered unsafe without additional safeguards [1][8].

Citations:

1: https://pip.pypa.io/en/stable/cli/pip_install.html

2: https://pip.pypa.io/en/stable/development/architecture/package-finding/

3: https://pip.pypa.io/en/stable/cli/pip_install/

4: https://pip.pypa.io/en/stable/cli/pip%5Findex/

5: https://stackoverflow.com/questions/46651454/whats-the-difference-between-find-links-and-index-url-pip-flags

6: https://pip.pypa.io/en/latest/_sources/development/architecture/package-finding.rst.txt

7: pypa/pip#9959

8: https://pip.pypa.io/en/latest/cli/pip_install/

🏁 Script executed:
cat -n ./requirements-xdna.txt
Repository: pulp-platform/Deeploy

Length of output: 476
Use --find-links for GitHub release wheels, but keep PyPI as an extra index.

Lines 5-6 use --extra-index-url for GitHub expanded_assets pages, which are not PEP 503 compliant indexes. Xilinx's official documentation recommends using --find-links instead. However, the PyPI index on line 7 must be retained as an extra index to resolve transitive dependencies of mlir_aie and llvm-aie.
Suggested fix
---extra-index-url https://github.com/Xilinx/mlir-aie/releases/expanded_assets/v1.3.2
---extra-index-url https://github.com/Xilinx/llvm-aie/releases/expanded_assets/nightly
 --extra-index-url https://pypi.org/simple
+--find-links https://github.com/Xilinx/mlir-aie/releases/expanded_assets/v1.3.2
+--find-links https://github.com/Xilinx/llvm-aie/releases/expanded_assets/nightly
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@requirements-xdna.txt` around lines 5 - 7, The GitHub expanded_assets URLs on
lines 5-6 are not PEP 503 compliant indexes and should use `--find-links`
instead of `--extra-index-url`. Replace `--extra-index-url` with `--find-links`
for both the mlir-aie and llvm-aie GitHub release URLs. Keep the PyPI URL on
line 7 as `--extra-index-url` since it needs to remain a proper index to resolve
transitive dependencies for the Xilinx packages.

Victor-Jung · 2026-06-17T13:58:23Z

Thanks for implementing the changes! I think we are almost there. I resolved most of the comments. Please check the remaining ones, rebase your work, update the Changelog, and let Claude extend the PR description with changes, fixes, and added stuff. Once the CI passes, I can take another quick look and hopefully merge it.

Done 😁

Victor-Jung added this to the Release 0.2.2 milestone Mar 24, 2026

Victor-Jung self-assigned this Mar 24, 2026

Victor-Jung added the Feature Addition of new features label Mar 24, 2026

Victor-Jung added this to Deeploy Mar 24, 2026

Victor-Jung mentioned this pull request Apr 7, 2026

Tracing Support in XDNA2 #181

Draft

5 tasks

Victor-Jung changed the title ~~DRAFT: Beta Support for XDNA2 Platform~~ Beta Support for XDNA2 Platform Apr 9, 2026

Victor-Jung marked this pull request as ready for review April 9, 2026 11:25

Victor-Jung requested review from Xeratec and runwangdl as code owners April 9, 2026 11:25

coderabbitai Bot reviewed Apr 9, 2026

View reviewed changes

Xeratec previously requested changes May 7, 2026

View reviewed changes

coderabbitai Bot reviewed May 7, 2026

View reviewed changes

coderabbitai Bot reviewed May 8, 2026

View reviewed changes

Comment thread .github/workflows/_runner-xdna2.yml

coderabbitai Bot reviewed May 8, 2026

View reviewed changes

Comment thread DeeployTest/generateNetwork_xdna2.py

Comment thread DeeployTest/generateNetwork_xdna2.py

Comment thread DeeployTest/Platforms/XDNA2/CMakeLists.txt

Comment thread README_XDNA.md

Xeratec self-requested a review May 21, 2026 15:18

Xeratec moved this to In review in Deeploy May 21, 2026

Victor-Jung force-pushed the pr/xdna-support branch from cde0c5d to 26f04f7 Compare June 17, 2026 08:41

Victor-Jung added 11 commits June 17, 2026 10:52

Update gitignore

9ce2363

XDNA2 Platform Beta Support

a937482

Add XDNA container

19e36c4

First attempt at generating MLIR code with Deeploy

5645eca

Generate tiled code but too much logic is in the Template

198df6a

Move data movement in passes. Template represent for loop and aquire/…

f550712

…release

Template is agnostic of tiling and data movement that are handled by …

695a78a

…code transformation passes

Add CI on self hosted runner

041f4f6

Remove unecessary install

ec8dbf5

Add cleanup step before checkout to fix permission

edce995

aie import is optional to not enforce mlir-aie and llvm-aie package i…

b700881

…nstallation

Victor-Jung and others added 15 commits June 17, 2026 10:53

Decouple xdna requirements from dev requirements

849c26e

Format

aa85772

Format

2e5acff

Add general todos for future refactoring

338cd9b

Format

0ac57e6

Free output tasks in RT sequence

ce90206

Add -v flag for XDNA platform only.

d565dfe

Pin llvm-aie version

1d80113

Add remainder loop to Add kernel

97746fe

Use workdir to easily discard build-time files

e90e0ed

Bump mlir-aie to v1.3.2 and use llvm-aie nightly

8a8e4f2

Add docker action for XDNA

7825a61

Use exernal_function instead of link_with to respect new norm from ML…

b42625e

…IR-AIE repo

Remove redundent mapping

8b4c504

Format

d46c45f

Victor-Jung force-pushed the pr/xdna-support branch from 26f04f7 to d46c45f Compare June 17, 2026 08:55

Update Changelog

b8e9e34

coderabbitai Bot reviewed Jun 17, 2026

View reviewed changes

Conversation

Victor-Jung commented Mar 24, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Added

Changed

PR Merge Checklist

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Xeratec left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Xeratec May 6, 2026

Choose a reason for hiding this comment

Uh oh!

Victor-Jung May 8, 2026

Choose a reason for hiding this comment

Uh oh!

Xeratec May 21, 2026

Choose a reason for hiding this comment

Uh oh!

Victor-Jung Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Xeratec May 6, 2026

Choose a reason for hiding this comment

Uh oh!

Victor-Jung Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Xeratec commented May 21, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Victor-Jung commented Mar 24, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 9, 2026 •

edited

Loading